PREFACE


There  are  two  unusual  features  of  the  writeup  of  this 
study  which  should  be  mentioned  at  the  outset. 

First,  due  to  the  large  number  of  symbols  used,  it  was 
felt  that  the  reader  would  benefit  from  a  glossary  of  symbols. 
Therefore,  such  a  glossary  has  been  included  at  the  extreme 
end  of  the  report.  As  Appendix  F  has  been  taken  from  another 
report  of  the  author  and  it  contains  self-explained  symbols,  the 
above  mentioned  glossary  does  not  apply  to  that  appendix. 

Secondly,  to  make  the  reading  of  the  main  text  easier, 
a  considerable  amount  of  theoretical  development  and  numerical 
results  have  been  placed  in  appendices.  It  is  hoped  that  this  and 
the  above  feature  will  help  convey  the  results  of  the  research. 
ABSTRACT


In  most  Markov  process  studies  to  date  it  has  been  assumed  that  both 
the transition probabilities and rewards are known exactly. The primary purpose
of this thesis is to study the effects of relaxing these assumptions to
allow  more  realistic  models  of  real  world  situations.  The  Bayesian  approach 
used  leads  to  statistical  decision  frameworks  for  Markov  processes. 

The  first  section  is  concerned  with  situations  where  the  transition 
probabilities  are  not  known  exactly. 

One approach used incorporates the concept of multi-matrix Markov
processes, processes where it is assumed that one of several known transition
matrices is being utilized, but we only have a probability vector on the
various matrices rather than knowing exactly which one is governing the
process. An explanation is given of the Bayes modification of the probability
vector when some transitions are observed. Next, we determine
various quantities of interest, such as mean recurrence times. Finally, a
discussion is presented of decision making involving multi-matrix Markov
processes.

The second approach assumes more directly that the transition
probabilities themselves are random variables. It is shown that the
multidimensional Beta distribution is a most convenient distribution
(for Bayes calculations) to place over the probabilities of a single row
of the transition matrix. Several important properties of the distribution
are displayed. Then a method is suggested for determining the
multidimensional Beta prior distributions to use for any particular Markov
process. Next we deal with the effects on various quantities of interest
of having such distributions over the transition probabilities. For
2-state processes, several analytic results are derived. Despite analytic
complexities, some interesting expressions are developed for N-state
cases.


It is shown that for decision purposes the expected values of the
steady state probabilities are important quantities. For a special
2-state situation, use of the hypergeometric function (previously utilized
in the solution of certain physics problems) permits evaluation of
these expected values. Their determination for 3 or more states
requires the use of simulation. Fortunately, a simple approximation
technique is shown to generally give accurate estimates of
the desired quantities. An entire chapter is devoted to statistical
decisions in Markov processes when the transition probabilities
are multidimensional Beta distributed rather than being exactly
known. The main problem considered is one where we have the
option of buying observations of a Markov process so as to improve
our knowledge of the unknown transition probabilities before
deciding whether or not to utilize the process.

In the second section of the study, we assume that the transition
probabilities are exactly known, but now the rewards are
random variables. First we display the Bayes modification of
two convenient distributions to use for the rewards. Next, the
expected rewards in various time periods are determined. Finally,
an explanation is presented of how to utilize these expected
rewards in making statistical decisions concerning Markov processes
whose rewards are not known exactly.


Thesis  Supervisor:  Ronald  A.  Howard 

Title:  Associate  Professor  of  Electrical  Engineering  and 
Associate  Professor  of  Industrial  Management 
CHAPTER 1


INTRODUCTION 

It  is  becoming  increasingly  evident  that  Markov  process  models 
are  playing  a  more  and  more  important  role  in  the  mathematical  analysis 
of  both  military  and  industrial  operations.  This  increased  practical 
importance  necessitates  further  research  in  the  fundamentals  of  Markov 
process  theory.  This  study  is  an  effort  in  this  direction. 

Practically  all  Markov  process  applications  to  date  have  assumed 
that all the relevant parameters (i.e. the rewards and transition
probabilities) are exactly known. In many situations this has been and will
continue  to  be  a  very  debatable  assumption.  The  purpose  of  this  dissertation 
is  to  study  the  effects  of  relaxing  these  assumptions  to  allow  more  realistic 
models  of  the  real  world  situation.  The  approach  used  leads  to  a  statistical 
decision  framework  for  Markov  processes. 

Since Bartlett's pioneering work^1 in 1951 several theoretical studies
have been devoted to the problem of estimating the transition probabilities
of Markov processes (see Billingsley's article^2 for an excellent reference
list). These studies have been concerned with statistical methods that
result in point estimates (i.e. exact single values) of the transition
probabilities. In the present research a different approach, that of a Bayesian
analyst, is to be used.


1. Bartlett, M., "The Frequency Goodness of Fit Test for Probability
Chains", Cambridge Philosophical Proceedings, 1951 (47), p. 86.

2. Billingsley, P., "Statistical Methods in Markov Chains",
Annals of Mathematical Statistics (U.S.), Vol. 32 (1961),
No. 1, pp. 12-40.
Prior knowledge about the unknown parameters (transition
probabilities or rewards) is expressed in the form of probability
distributions over the unknown parameters; i.e., the parameters are
considered as random variables. These prior probability distributions
are modified through the use of Bayes' rule when observations (perhaps
in the form of transitions or rewards) of the Markov process are made.
More precisely, the a posteriori distribution is given by*


\[
f_{x_1, x_2, \ldots, x_n \mid E}(y_1, y_2, \ldots, y_n \mid E)
  = \frac{\Pr(E \mid y_1, y_2, \ldots, y_n)\,
          \overbrace{f_{x_1, x_2, \ldots, x_n}(y_1, y_2, \ldots, y_n)}^{\text{prior}}}
         {\Pr(E)},
\]


a direct consequence of Bayes' rule. $x_1, x_2, \ldots, x_n$ are the unknown
parameters and $E$ denotes the observations. The a posteriori distributions
over the parameters, unlike point estimates, clearly reflect
our uncertainty as to the exact values of the parameters and this
uncertainty can be incorporated when making statistical decisions about
the process. An excellent description of the philosophy of the Bayes


* The notation is described in footnote number 12 found on page 36.
approach has been presented by Savage.^a Therefore, we refer the
interested reader to Savage's article for a justification of this approach
rather than reproducing his ideas here. Fortunately, his article also
contains an extensive list of references on Bayes procedures. In fact,
it is probably the best starting point to become familiar with the entire
area of Bayesian statistics.

The  following,  in  itself,  is  an  important  reason  for  using  a 
Bayes  approach  in  the  study  of  Markov  processes.  Quite  often  prior  to 
observational  data  the  analyst  knows  considerable  information  about  the 
unknown  parameters,  but  not  enough  for  him  to  state  outright  that  the 
parameters  can  be  assigned  exactly  known  values.  If  the  observations 
have  significant  costs  associated  with  them,  the  analyst  must  decide 
exactly  how  many  observations  are  worthwhile.  A  Bayes  approach  allows 
him  to  make  his  decision  in  a  quantitative  manner. 

In Section I we shall be concerned with situations where the
transition probabilities are not known exactly. More precisely, Chapter 2
assumes that we have a probability vector over several possible
transition matrices,

\[
\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_k, \ldots, \alpha_Q]
\]

where $\alpha_k$ is the probability that matrix ${}^{k}P$ is being used ($k = 1, 2, \ldots, Q$).

Then, in Chapters 3, 4 and 5 we assume more directly that the transition
probabilities themselves are random variables. Chapter 6 involves
a brief look at continuous time processes where the transition rates are
random variables rather than being exactly known.

a. Savage, L. J., "Bayesian Statistics," Recent Developments in Information
and Decision Processes, Macmillan, 1962.
The approach of placing probability distributions directly over
the transition probabilities is probably more appealing than that used in
Chapter 2 where we deal with multi-matrix Markov processes, processes
where it is assumed that one of Q known transition matrices is being
utilized, but we only have a probability vector on the various matrices
rather than knowing exactly which one is being used. However, the
multi-matrix approach does make the mathematical theory and computations
considerably simpler in certain portions of the study. Also, as will be
explained later, it can be thought of as a step toward the combination of
Markov process theory and game theory.

In Chapter 2, with the multi-matrix framework assumed, a
detailed explanation is presented of the Bayes modification of the
probability vector ($\alpha$) when some transitions are observed. Next we
determine various quantities of interest, such as mean recurrence
times. Finally, decision making involving multi-matrix Markov
processes is discussed.

As  mentioned  earlier,  in  Chapters  3,  4  and  5,  we  assume  directly 
that the transition probabilities themselves are random variables.
Unfortunately the analytic considerations become far more formidable than
those  encountered  in  Chapter  2.  Still,  many  interesting  and  potentially 
useful  results  are  developed. 

Chapter 3 is concerned with the multidimensional Beta distribution,
a most convenient distribution to place over the transition probabilities of a
single row of the transition matrix. In research work concurrent with this
study two other individuals, Murphy^3 and Mosimann^4, have developed this

3. Murphy, Roy E., Jr., Adaptive Processes in Economic Systems, Technical
Report No. 119, Institute for Mathematical Studies in the Social Sciences,
Stanford University, 1962.

4. Mosimann, J. E., "On the Compound Multinomial Distribution, the
Multivariate β-distribution, and Correlations among Proportions",
Biometrika, Vol. 49 (1962), pp. 65-82.
distribution as a convenient one to place over the parameters of
a multinomial distribution. However, their purposes in doing
this did not include the study of Markov processes. After the
important properties of this distribution are listed, a method is
suggested for determining the a priori parameters. Next, the
Bayes modification of the distribution is outlined. Then we concern
ourselves with the development of the multidimensional Beta
priors for a specific N-state Markov process. Finally, two
methods  of  simulating  a  multidimensional  Beta  distribution  are 
suggested.  This  simulation  is  required  later  in  the  study. 

Chapter  4  deals  with  the  effects  on  various  quantities  of 
interest  of  having  multidimensional  Beta  distributions  over  the 
transition  probabilities.  A  particularly  detailed  analytical  study 
is  made  of  the  expected  values  of  the  steady  state  probabilities  in 
the 2-state case. For 3 or more states, analytic complexities
necessitate  the  use  of  simulation.  Detailed  results  of  simulations 
for  3  and  4  state  processes  are  presented.  First  passage  times, 
state  occupancy  times,  transient  behavior  and  a  trapping  state 
situation  are  also  studied. 

In  Chapter  5  we  attack  the  problem  of  making  statistical 
decisions  in  Markov  processes  when  the  transition  probabilities 
are  not  known  exactly.  First,  the  importance  of  the  expected  values 
of  the  steady  state  probabilities  is  displayed.  Then  we  consider  a 
specific 2-state problem where the decision is based upon the expected
reward in s periods in the steady state and we have the option
of buying observations of the process so as to improve our knowledge
of the unknown transition probability. The next section deals with
an analogous N-state problem. After this, several other items of
interest  for  statistical  decision  purposes  are  discussed.  Finally, 
we  look  at  statistical  decisions  under  a  transient  situation. 
In Section II we assume that the transition probabilities are
exactly  known  but  now  the  rewards  are  random  variables.  First,  in 
Chapter  7,  a  presentation  is  made  of  two  convenient  distributions  to  use 
for  the  rewards.  Bayes  modification  of  them  is  also  displayed.  Then 
Chapter  8  is  concerned  with  the  determination  of  the  expected  rewards 
in  various  time  periods  (both  steady  state  and  transient).  Chapter  9 
shows  how  to  utilize  these  expected  rewards  in  making  statistical 
decisions  in  Markov  processes  where  the  rewards  are  not  known  exactly. 

Chapter 10, entitled "Conclusions", summarizes the more
important  points  of  the  study  and  also  suggests  several  related  areas 
for  further  research. 
SECTION I


TRANSITION  PROBABILITIES  NOT  KNOWN  EXACTLY 

In  this  section  we  remove  the  usual  Markov  requirement  that  the 
transition  probabilities  be  known  exactly;  rather  we  assume  that  the 
transition  probabilities  (or  rates  or  matrices)  are  themselves  random 
variables. There are three primary objectives. The first is to determine
a reasonably simple method for placing convenient distributions
over the transition probabilities (or rates or matrices). Secondly, we
want to be able to easily modify these distributions through the use of
Bayes' rule after observing some transitions. Finally, it is important
to know the effects on various quantities of interest of having
distributions over the transition probabilities (or rates or matrices). This
latter  consideration  leads  to  a  statistical  decision  framework  for 
Markov  processes. 
CHAPTER 2


THE MULTI-MATRIX MARKOV PROCESS


2.1 Outline of the Process

Instead  of  placing  probability  distributions  directly  over  the 
transition  probabilities  (as  will  be  done  in  Chapters  3-5)  we  proceed 
as  follows:  It  is  assumed  that  the  Markov  process  is  governed  by  one 
of  Q  given  transition  matrices  whose  elements  are  exactly  known.  This 
one matrix is always used but we are not sure as to exactly which one of
the Q matrices it is. In fact, we define a probability vector

\[
\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_Q]
\]

where $\alpha_j$ = probability that matrix j is governing the process. A process
defined in this way is called a multi-matrix Markov process.

With this situation existing, we show in section 2.2 how to update
the $\alpha$ vector (by Bayes' rule) when several transitions of the process are
observed. Next we discuss the problem of determining various quantities
of interest such as the steady state probabilities. In section 2.4 consideration
is given to possible cost structures for the multi-matrix Markov
process. Finally, in section 2.5 we demonstrate statistical decision theory
for multi-matrix Markov processes by analyzing a specific decision problem.

Although  the  multi-matrix  Markov  framework  is  less  appealing  in  a 
physical  sense  than  is  placing  probability  distributions  directly  over  the 
transition probabilities, it does make the mathematical theory and computations
considerably simpler in certain portions of the study. Moreover,
it can be thought of as a step toward the combination of Markov process
theory and game theory. The multi-matrix process can be considered as
follows. An opponent (or nature) selects one of Q known transition
matrices with which to run the Markov process, but we only know
probabilistically  which  one  has  been  selected.  We  then  have  to  decide 
upon  our  best  course  of  action  (as  regards  using  the  process,  etc.)  under 
these  circumstances.  Note,  however,  that  it  is  assumed  that  the  opponent 
(or  nature)  can  not  switch  matrices  once  a  choice  has  been  made.  The 
next  step  from  the  game  theoretic  point  of  view  would  be  to  allow 
switching.  Such  a  situation  will  not  be  considered  here. 

2.2 Bayes Modification of the $\alpha$ Vector
2.2.1 Ignoring the Starting State


Let B(F) be the event that a number of consecutive transitions are
observed having a frequency count $F = (f_{ij})$ where $f_{ij}$ = number of transitions
from state i to state j. Two points should be noted. First, we do not
require that the exact order of transitions be known. Secondly, we
assume for the moment that the starting state provides no information
about which matrix is being used. This assumption could be satisfied in
one of two ways: either the process is forced to start in a particular state
regardless of which matrix is being used, or we have absolutely no idea as
to which state was the starting one.


Let the Q possible matrices be designated by ${}^{1}P, {}^{2}P, \ldots, {}^{k}P, \ldots, {}^{Q}P$
with elements ${}^{1}p_{ij}, {}^{2}p_{ij}, \ldots, {}^{k}p_{ij}, \ldots, {}^{Q}p_{ij}$ respectively. Then

\[
\mathrm{pr}(B(F) \mid {}^{k}P) = N(F) \prod_{i,j=1}^{N} ({}^{k}p_{ij})^{f_{ij}}
\]

where $\prod_{i,j=1}^{N} ({}^{k}p_{ij})^{f_{ij}}$ is the probability of a particular sequence that
would produce the frequency count F given that matrix ${}^{k}P$ is being used,
and N(F) is the number of such sequences. N(F) is independent of the P
matrix used, hence will cancel out in the Bayes calculations. This is
fortunate as the expression for N(F) is extremely complicated.^5


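Suppose that prior to the event B(F) we have a probability vector
$\alpha' = (\alpha_1', \alpha_2', \ldots, \alpha_k', \ldots, \alpha_Q')$ where $\alpha_k'$ is the probability that
matrix ${}^{k}P$ is being used. Then utilizing Bayes' rule the posterior
probability that matrix ${}^{k}P$ is being used is

\[
\begin{aligned}
\alpha_k'' = \mathrm{pr}({}^{k}P \mid B(F))
  &= \frac{\mathrm{pr}(B(F) \mid {}^{k}P)\, \mathrm{pr}({}^{k}P)}{\mathrm{pr}(B(F))} \\
  &= \frac{N(F) \prod_{i,j=1}^{N} ({}^{k}p_{ij})^{f_{ij}}\, \alpha_k'}
          {\sum_{m=1}^{Q} N(F) \prod_{i,j=1}^{N} ({}^{m}p_{ij})^{f_{ij}}\, \alpha_m'} \\
  &= \frac{\alpha_k' \prod_{i,j=1}^{N} ({}^{k}p_{ij})^{f_{ij}}}
          {\sum_{m=1}^{Q} \alpha_m' \prod_{i,j=1}^{N} ({}^{m}p_{ij})^{f_{ij}}}
\end{aligned}
\tag{2.1}
\]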


Hence,  the  determination  of  the  a  posteriori  probabilities  is  a  relatively 
simple  operation. 

Numerical  Example 

Bob confronts his friend Ray with an interesting problem. Bob has
two cages with an interconnecting door. He is also the owner of two white
mice that are identical in appearance. The behavior of these mice in the


5. Whittle, P., "Some Distribution and Moment Formulae for Markov
Chains", Journal of the Royal Statistical Society, Series B, Vol. 17
(1955), pp. 235-242.
cages is exactly known to Ray. Mouse no. 1 has a transition matrix

\[
{}^{1}P = \begin{bmatrix} 0.1 & 0.9 \\ 0.7 & 0.3 \end{bmatrix}
\]

and mouse no. 2 has the corresponding matrix

\[
{}^{2}P = \begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \end{bmatrix}
\]

where rows and columns are indexed by cage (1, 2).

Bob also informs Ray that he has selected mouse no. 1 with probability
0.8 and mouse no. 2 with probability 0.2 and once selected the mouse is
never replaced, i.e. the same mouse will always be in the cages.

Now suppose Ray has observed six transitions (time periods)
with frequency count

\[
F = \begin{bmatrix} 2 & 2 \\ 2 & 0 \end{bmatrix}
\]

What should be his a posteriori probabilities that each of the two mice
is in use?


This is a multi-matrix Markov process with N = 2 and Q = 2.

Using equation (2.1)

\[
\alpha_1'' = \frac{0.8\,[(0.1)^2 (0.9)^2 (0.7)^2 (0.3)^0]}
                  {0.8\,[(0.1)^2 (0.9)^2 (0.7)^2 (0.3)^0] + 0.2\,[(0.5)^2 (0.5)^2 (0.4)^2 (0.6)^0]}
           = .137
\]

i.e. $\alpha'' = [.137,\ .863]$
It is apparent that the six transitions have radically altered Ray's
feelings about which mouse is being used. Of course, this is primarily
due  to  the  two  1-1  transitions  which  are  very  unlikely  if  mouse  no.  1  is 
in  action.  However,  this  simple  example  does  illustrate  an  important 
point.  If  the  possible  matrices  are  significantly  different,  considerable 
influence  is  played  by  the  transition  data  in  obtaining  the  a  posteriori 
probabilities  even  if  very  few  transitions  are  observed. 
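
A minimal Python sketch of equation (2.1), applied to the mouse data, may make the update concrete. The function name `posterior` and its argument layout are assumptions introduced here for illustration, not part of the original study; N(F) is left out because it cancels. Evaluated this way the inputs give roughly 0.61 for mouse no. 1, whereas the printed .137 is reproduced if the mouse no. 1 product $(0.1)^2(0.9)^2(0.7)^2$ is taken as 0.0003969 instead of 0.003969, so the printed figures may deserve a recheck.

```python
from math import prod


def posterior(prior, matrices, counts):
    """Equation (2.1): posterior probability of each candidate matrix.

    prior    -- list of prior probabilities alpha'_k
    matrices -- list of transition matrices (nested lists), one per k
    counts   -- frequency count F, counts[i][j] = transitions i -> j
    N(F) is omitted because it is common to every term and cancels.
    """
    n = len(counts)
    weights = []
    for alpha_k, p in zip(prior, matrices):
        likelihood = prod(p[i][j] ** counts[i][j]
                          for i in range(n) for j in range(n))
        weights.append(alpha_k * likelihood)
    total = sum(weights)
    return [w / total for w in weights]


P1 = [[0.1, 0.9], [0.7, 0.3]]   # mouse no. 1
P2 = [[0.5, 0.5], [0.4, 0.6]]   # mouse no. 2
F = [[2, 2], [2, 0]]            # observed frequency count

post = posterior([0.8, 0.2], [P1, P2], F)
print(post)
```

Because only the ratios of the weighted likelihoods matter, the same routine applies unchanged to any number of states N and candidate matrices Q.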

2.2.2  Incorporation  of  the  Starting  State 

In  many  situations  it  is  meaningful  to  incorporate  the  knowledge  of 
the  starting  state  into  the  Bayesian  modification.  Suppose  that  the 
observed  transition  sequence  started  in  state  s.  Then,  if  the  probability 
that  the  starting  state  is  s  is  a  function  of  the  specific  matrix  used,  this 
information  should  be  utilized. 


Let $s_k$ be the probability that the starting state is s given that
matrix ${}^{k}P$ is governing the process and let (B(F), s) be the joint event
that the starting state is s and a sequence is observed having a frequency count F.
Then

\[
\alpha_k'' = \mathrm{pr}({}^{k}P \mid B(F), s)
  = \frac{\alpha_k'\, s_k \prod_{i,j=1}^{N} ({}^{k}p_{ij})^{f_{ij}}}
         {\sum_{m=1}^{Q} \alpha_m'\, s_m \prod_{i,j=1}^{N} ({}^{m}p_{ij})^{f_{ij}}}
\tag{2.2}
\]

$s_k$ would generally be calculated in one of two ways:

i) If we could assume that the sequence of transitions was observed when the
process was in the steady state, then $s_k = {}^{k}\pi_s$, the steady state
probability of being in state s given that matrix ${}^{k}P$ is in operation.

ii) If we were told that the process was forced to be in state u n time
periods before the sequence of observations, then $s_k = {}^{k}\phi_{us}(n)$,
the probability that the state at time n is s given that the state at time 0
is u and that matrix ${}^{k}P$ is being used. Howard^6 describes several
methods for obtaining this quantity.

Numerical Example

For the mouse example of the previous section suppose that Ray
was also informed that the observed transitions occurred when the mouse
had been in the cages for a long time and the first observed transition
was from cage 1.


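\[
{}^{1}P = \begin{bmatrix} 0.1 & 0.9 \\ 0.7 & 0.3 \end{bmatrix}, \quad
{}^{2}P = \begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \end{bmatrix}, \quad
\alpha' = [0.8,\ 0.2], \quad
F = \begin{bmatrix} 2 & 2 \\ 2 & 0 \end{bmatrix}
\]

Now we solve ${}^{1}\pi\, {}^{1}P = {}^{1}\pi$ and ${}^{2}\pi\, {}^{2}P = {}^{2}\pi$ to obtain ${}^{1}\pi = [7/16,\ 9/16]$
and ${}^{2}\pi = [4/9,\ 5/9]$ respectively. Then using equation (2.2) there
results

\[
\alpha'' = [.135,\ .865]
\]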

The  probability  of  mouse  no.  1  has  been  reduced  still  further  by  the 
knowledge  of  the  starting  condition. 
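
Equation (2.2) differs from (2.1) only in the extra factor $s_k$. The sketch below, with the hypothetical helper name `posterior_with_start`, uses exact fractions and takes $s_k$ as the steady state probability of cage 1 under each mouse (7/16 and 4/9). Setting every $s_k$ to 1 recovers equation (2.1) for comparison. Direct evaluation gives about 0.610 for mouse no. 1 against about 0.614 when the starting state is ignored, so the direction of the change agrees with the text even though the magnitudes differ from the printed .135 and .137.

```python
from fractions import Fraction as Fr
from math import prod


def posterior_with_start(prior, start_probs, matrices, counts):
    """Equation (2.2): weight each prior by s_k before normalizing."""
    n = len(counts)
    weights = [
        a * s * prod(p[i][j] ** counts[i][j]
                     for i in range(n) for j in range(n))
        for a, s, p in zip(prior, start_probs, matrices)
    ]
    total = sum(weights)
    return [w / total for w in weights]


P1 = [[Fr(1, 10), Fr(9, 10)], [Fr(7, 10), Fr(3, 10)]]   # mouse no. 1
P2 = [[Fr(1, 2), Fr(1, 2)], [Fr(2, 5), Fr(3, 5)]]       # mouse no. 2
F = [[2, 2], [2, 0]]
prior = [Fr(4, 5), Fr(1, 5)]

s = [Fr(7, 16), Fr(4, 9)]   # steady state probability of cage 1 per mouse
with_start = posterior_with_start(prior, s, [P1, P2], F)
no_start = posterior_with_start(prior, [1, 1], [P1, P2], F)

print([round(float(x), 4) for x in with_start])
print([round(float(x), 4) for x in no_start])
```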


6. Howard, R. A., Dynamic Probabilistic Systems, in preparation.
2.2.3 $\alpha'$ Unknown


When $\alpha'$ is completely unknown one might argue that a good
approximation would be achieved by using

\[
\frac{s_k}{\sum_{m=1}^{Q} s_m}
\]

However this is equivalent to assuming $\alpha_k' = 1/Q$ and then using the
starting information as earlier, for

\[
\frac{\frac{1}{Q}\, s_k}{\sum_{m=1}^{Q} \frac{1}{Q}\, s_m} = \frac{s_k}{\sum_{m=1}^{Q} s_m}
\]


2.3 Determination of Various Quantities of Interest

Many  deterministic  quantities  in  a  regular  Markov  process  become 
random  variables  in  the  multi-matrix  Markov  process.  Furthermore,  all 
these  random  variables  are  of  the  discrete  variety  because  of  the  discrete 
nature  of  the  probability  distribution  on  the  matrices. 

2.3.1 Mean Recurrence Times

For a known transition matrix we can easily obtain the mean
recurrence times through the use of matrix or flow graph techniques.^7,8

7. Ibid.

8. Kemeny, J. G., and Snell, J. L., Finite Markov Chains,
Van Nostrand, 1960, p. 79.


The  mean  recurrence  time  for  state  i  is  the  mean  time  to  go  from  that 
state  to  itself  (staying  in  the  state  on  the  first  transition  results  in  a 
recurrence  time  of  length  1). 

With matrix ${}^{k}P$ we have

\[
{}^{k}\bar{n} = ({}^{k}\bar{n}_{11}, {}^{k}\bar{n}_{22}, \ldots, {}^{k}\bar{n}_{NN})
\]

where ${}^{k}\bar{n}_{ii}$ = mean recurrence time for state i | matrix ${}^{k}P$ is being used.

However, because ${}^{k}P$ is only being used with probability $\alpha_k$, the mean
recurrence times now become random variables with probability mass
functions^9

\[
\mathrm{pr}(\bar{n}_{ii} = {}^{k}\bar{n}_{ii}) = p_{\bar{n}_{ii}}({}^{k}\bar{n}_{ii}) = \alpha_k,
\qquad k = 1, 2, \ldots, Q
\]

This random variable will have
a mean value given by

\[
E(\bar{n}_{ii}) = \sum_{k=1}^{Q} \alpha_k\, {}^{k}\bar{n}_{ii}
\]


9. The probability mass function is

$p_x(k)$ = pr{that a discrete random variable x takes on the value k}, $k = k_1, k_2, \ldots, k_n$.

Also $p_{x \mid y}(k \mid j)$ = pr{that x takes on the value k | the variable
y takes on the value j}, $k = k_1, k_2, \ldots, k_n$.
Numerical Example


Again  consider  the  mouse  example.  Given  that  a  mouse  is  seen 
in  cage  i  (i  =  1,  2),  what  is  the  expected  number  of  transitions  until  the 
moment  when  he  first  appears  again  in  the  same  cage  ? 


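\[
{}^{1}P = \begin{bmatrix} 0.1 & 0.9 \\ 0.7 & 0.3 \end{bmatrix}, \quad
{}^{2}P = \begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \end{bmatrix}, \quad
\alpha = [0.8,\ 0.2]
\]

It is shown in Appendix N that for

\[
P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix},
\qquad \bar{n}_{11} = \frac{a+b}{b} \quad \text{and} \quad \bar{n}_{22} = \frac{a+b}{a}
\]

Therefore

\[
{}^{1}\bar{n}_{11} = \frac{0.9 + 0.7}{0.7} = \frac{16}{7},
\]

the mean time for mouse no. 1 to go from cage 1 to cage 1. Similarly

\[
{}^{1}\bar{n}_{22} = \frac{16}{9}, \qquad {}^{2}\bar{n}_{11} = \frac{9}{4}, \qquad {}^{2}\bar{n}_{22} = \frac{9}{5}
\]

Hence, $\bar{n}_{11}$ and $\bar{n}_{22}$ have the following probability mass functions: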
11  ML 
[Figure 2.1 - The Probability Mass Functions of the Mean Recurrence Times
in a Multi-Matrix Markov Process. Each mass function has mass 0.8 for
mouse no. 1 and 0.2 for mouse no. 2, with the mean marked (1.78 for $\bar{n}_{22}$).]

Similar probability mass functions could be obtained for $\bar{n}_{ij}$, the
mean time to go from cage i to cage j, with $i \neq j$.
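
The mean recurrence time calculation above can be sketched in a few lines of Python. The helper name `recurrence_times` is an assumption made for illustration; it simply applies the two-state formulas $(a+b)/b$ and $(a+b)/a$ quoted from the appendix, and exact fractions are used so the values can be compared directly with 16/7, 16/9, 9/4 and 9/5.

```python
from fractions import Fraction as Fr


def recurrence_times(a, b):
    """Mean recurrence times (n11, n22) for P = [[1-a, a], [b, 1-b]]."""
    return (a + b) / b, (a + b) / a


alpha = [Fr(4, 5), Fr(1, 5)]                       # probabilities of mouse 1, 2
mice = [recurrence_times(Fr(9, 10), Fr(7, 10)),    # mouse no. 1
        recurrence_times(Fr(1, 2), Fr(2, 5))]      # mouse no. 2

means = []
for cage in (0, 1):
    # Each candidate value carries the probability of its matrix.
    pmf = [(n[cage], a_k) for n, a_k in zip(mice, alpha)]
    means.append(sum(value * prob for value, prob in pmf))
    print(f"cage {cage + 1}: pmf={pmf}, mean={float(means[-1]):.3f}")
```

The cage 2 mean evaluates to 401/225, about 1.78, which matches the mean marked in Figure 2.1.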

2.3.2  Steady  State  Probabilities 

When matrix ${}^{k}P$ is being used the steady state probability vector
is given by

\[
{}^{k}\pi = [{}^{k}\pi_1, {}^{k}\pi_2, \ldots, {}^{k}\pi_N]
\]

As with the mean recurrence times we can think of the steady state
probabilities as random variables, e.g.

\[
\mathrm{pr}(\pi_i = {}^{k}\pi_i) = p_{\pi_i}({}^{k}\pi_i) = \alpha_k, \qquad k = 1, \ldots, Q
\]

or more directly we can say that

\[
E(\pi_i) = \sum_{k=1}^{Q} \alpha_k\, {}^{k}\pi_i,
\]

the mean value of $\pi_i$, is the expected
probability of being in state i in the steady state.


Numerical Example (same as in previous section)


\[
E(\pi_1) = 0.8\left(\tfrac{7}{16}\right) + 0.2\left(\tfrac{4}{9}\right) = .439;
\]

this is the expected
probability that a look at cage 1 after a long time will find a mouse in
that cage during the period in which the look is taken. Also

\[
E(\pi_2) = 0.561
\]
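
As a rough numerical check of these expected steady state probabilities, the sketch below approximates each steady state vector by repeatedly applying the transition matrix and then averages over the two mice. The function names `stationary` and `expected_steady_state` and the iteration count are assumptions made for this illustration; the text itself obtains the vectors by solving $\pi P = \pi$ directly.

```python
def stationary(P, iterations=500):
    """Approximate the steady state vector by iterating pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def expected_steady_state(alpha, matrices):
    """E(pi_i) = sum over k of alpha_k times the k-th steady state value."""
    vectors = [stationary(P) for P in matrices]
    n = len(vectors[0])
    return [sum(a * v[i] for a, v in zip(alpha, vectors)) for i in range(n)]


P1 = [[0.1, 0.9], [0.7, 0.3]]   # mouse no. 1
P2 = [[0.5, 0.5], [0.4, 0.6]]   # mouse no. 2

expected = expected_steady_state([0.8, 0.2], [P1, P2])
print([round(x, 3) for x in expected])
```

Simple iteration is adequate here because both matrices are regular; for larger or slowly mixing chains a direct linear solve would normally be preferred.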

2.3.3 Transient Behavior

As was done in section 2.2.2 let ${}^{k}\phi_{ij}(n)$ = pr{state at time n
is j | state at time 0 is i and matrix ${}^{k}P$ is in use}. The unconditional
multi-step transition probability, $\phi_{ij}(n)$, will again have a probability
mass function and we can say that

\[
E[\phi_{ij}(n)] = \sum_{k=1}^{Q} \alpha_k\, {}^{k}\phi_{ij}(n),
\]

the mean value of $\phi_{ij}(n)$, is
the expected probability of being in state j at time n given that the state
at time 0 is i.

Numerical Example (same as in section 2.3.1).

Again using either flow graphs or matrix inversion we obtain

\[
{}^{1}\phi_{12}(n) = \frac{9}{16}\left[1 - (-0.6)^n\right], \qquad n \geq 0.
\]

This is the probability that mouse no. 1 will be in cage 2 at time n | mouse
no. 1 was in cage 1 at time 0 and

\[
{}^{2}\phi_{12}(n) = \frac{5}{9}\left[1 - (0.1)^n\right], \qquad n \geq 0.
\]

This is the same as above except for mouse no. 2.

\[
E[\phi_{12}(n)] = 0.561 - 0.450\,(-0.6)^n - 0.111\,(0.1)^n, \qquad n \geq 0
\]

This is the probability that a mouse will be in cage 2 at time n | a mouse
was in cage 1 at time 0.
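
The closed form for $E[\phi_{12}(n)]$ can be checked against direct matrix powers. The sketch below is illustrative only: the helper names `mat_mul`, `n_step`, `expected_phi` and `closed_form` are assumptions introduced here, and `closed_form` uses the unrounded coefficients 0.8(9/16) and 0.2(5/9) that round to the 0.450 and 0.111 quoted above.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step(P, n):
    """Return the n-step transition matrix P^n (identity for n = 0)."""
    size = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result


def expected_phi(alpha, matrices, i, j, n):
    """E[phi_ij(n)] averaged over the candidate matrices; i, j are 0-based."""
    return sum(a * n_step(P, n)[i][j] for a, P in zip(alpha, matrices))


def closed_form(n):
    """Closed form for E[phi_12(n)] in the mouse example, before rounding."""
    return 0.8 * (9 / 16) * (1 - (-0.6) ** n) + 0.2 * (5 / 9) * (1 - 0.1 ** n)


P1 = [[0.1, 0.9], [0.7, 0.3]]   # mouse no. 1
P2 = [[0.5, 0.5], [0.4, 0.6]]   # mouse no. 2

for n in range(6):
    direct = expected_phi([0.8, 0.2], [P1, P2], 0, 1, n)
    print(n, round(direct, 4), round(closed_form(n), 4))
```

As n grows both columns approach 0.561, the expected steady state probability of cage 2.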

2.3.4 Mean State Occupancy Time

Let ${}^{k}\bar{u}_i$ be the mean number of transitions to exit from state i |
matrix ${}^{k}P$ is being used and the process is in state i before the first
transition.

\[
{}^{k}\bar{u}_i = \frac{1}{1 - {}^{k}p_{ii}}
\]

(mean of a geometric distribution with parameter ${}^{k}p_{ii}$)

Now $\bar{u}_i$ will have a probability mass function given by

\[
\mathrm{pr}(\bar{u}_i = {}^{k}\bar{u}_i) = p_{\bar{u}_i}({}^{k}\bar{u}_i) = \alpha_k,
\qquad k = 1, 2, \ldots, Q.
\]


2.4 Addition of a Cost Framework

There  are  several  possible  cost  frameworks  for  the  multi-matrix 
Markov  process.  Two  will  be  discussed  in  detail  and  a  third  will  be 
mentioned. 

2.4.1  Simple  c..  Form 
_ bJ _ 

Let  c.  .  =  cost  of  assuming  that  matrix  i  is  being  used  when  in 

effect  j  is  being  utilized. 

We  have  aQ  x  Q  matrix  C  =  (c„). 

Let  C  (k)  =  cost  of  assuming  that  matrix  KP  is  bei.ig  used. 

Q 

Then  E  [  C  (k)  ]  =  ^  n  j 
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To  minimize  the  expected  cost  we  select  k  as  follows: 


min 

1  <k  <Q 


Q 

Zv 

j  =  i 


kj 


(2  3) 


Numerical  Example 


For the mouse game Bob has decided to make things interesting by introducing a monetary aspect. Ray is forced to play and is confronted with the following cost matrix:

$$C = \begin{bmatrix} 0 & 5 \\ 16 & 0 \end{bmatrix}$$

This says, for example, that if he guesses that mouse no. 2 is being used when in reality no. 1 is in the cages, Ray must pay \$16 for his error. It is reasonable to have $c_{ii} = 0$ since $c_{ii}$ is the penalty associated with making a correct decision. Using the data of section 2.2.1 we had $\alpha' = [0.8, 0.2]$ and after the set of transitions with frequency count


F  = 


2 

0 


we had $\alpha'' = [0.137, 0.863]$.

With $\alpha' = [0.8, 0.2]$:

E[C(1)] = 0.8(0) + 0.2(5) = 1.0

E[C(2)] = 0.8(16) + 0.2(0) = 12.8 > 1.0

Therefore, prior to the observations Ray would have said that mouse no. 1 was being used, with an expected cost of 1.0. With $\alpha'' = [0.137, 0.863]$:

E[C(1)] = 0.137(0) + 0.863(5) = 4.315 and

E[C(2)] = 0.137(16) + 0.863(0) = 2.192 < 4.315

Therefore, after the observations, even though $c_{21} \gg c_{12}$, he would say that mouse no. 2 is in the cages, with an expected cost of 2.192. It is seen that the small number of transitions has completely altered the decision and associated expected cost.
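The selection rule of equation (2.3) is easy to mechanize. The sketch below assumes the 2 × 2 cost matrix of this example; `expected_costs` and `best_guess` are hypothetical helper names, and matrices are numbered from 1 in the returned result to match the text.

```python
C = [[0, 5], [16, 0]]  # C[k][j]: cost of assuming matrix k+1 when matrix j+1 is in use


def expected_costs(alpha, cost):
    """E[C(k)] = sum_j alpha_j * c_kj for every candidate k."""
    return [sum(a * c for a, c in zip(alpha, row)) for row in cost]


def best_guess(alpha, cost):
    """Return (k, E[C(k)]) for the matrix number k (1-based) of minimum expected cost."""
    costs = expected_costs(alpha, cost)
    k = min(range(len(costs)), key=costs.__getitem__)
    return k + 1, costs[k]
```

With the prior [0.8, 0.2] the rule picks mouse no. 1, and with the posterior [0.137, 0.863] it picks mouse no. 2, in line with the hand calculation.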

2.4.2 The Expected Net Profit per Transition

Here we assume that the important cost element is the expected net profit per transition. This quantity and its importance in statistical decision theory will be discussed at length in Chapter 5.

Let $r_{ij}$ be the reward for each transition from state i to state j (i, j = 1, 2, ..., N) and let c be the cost per transition for using the process. Then the expected reward per transition given that matrix ${}^{k}P$ is being used is ${}^{k}R$, where

$${}^{k}R = \sum_{i=1}^{N} {}^{k}\pi_i \sum_{j=1}^{N} {}^{k}p_{ij}\, r_{ij} \qquad \ldots (2.4)$$

with ${}^{k}\pi_i$ being the steady state probability of being in state i given that ${}^{k}P$ is in use.

With the probability distribution, $\alpha$, over the possible matrices, the expected reward per transition, E(R), is given by

$$E(R) = \sum_{k=1}^{Q} \alpha_k \, {}^{k}R = \sum_{k=1}^{Q} \alpha_k \sum_{i=1}^{N} \sum_{j=1}^{N} {}^{k}\pi_i \, {}^{k}p_{ij}\, r_{ij} \qquad \ldots (2.5)$$

and the expected net revenue per transition is E(R) - c.


Numerical  Example 

In the mouse problem (section 2.2.1) Ray is now confronted with a different cost structure where he may have a chance of making some money. He must pay \$3.60 per time period to have a mouse in the cages but he is rewarded for mouse transitions as follows:

$$\begin{bmatrix} 0 & 8 \\ 3 & 2 \end{bmatrix} \quad \text{(in dollars)}$$

where $r_{ij}$ is the reward for a mouse transition from cage i to cage j. Should he now accept Bob's offer?

c = 3.6

Using the data of section 2.2.1 we had $\alpha' = [0.8, 0.2]$ and after the set of transitions with frequency count F we had $\alpha'' = [0.137, 0.863]$.

From equation (2.4)

$${}^{1}R = {}^{1}\pi_1\left({}^{1}p_{11} r_{11} + {}^{1}p_{12} r_{12}\right) + {}^{1}\pi_2\left({}^{1}p_{21} r_{21} + {}^{1}p_{22} r_{22}\right) = \tfrac{7}{16}(0 + 7.2) + \tfrac{9}{16}(2.1 + 0.6) = \$4.67$$

Similarly ${}^{2}R = 3.11$. Then using equation (2.5):

For $\alpha' = [0.8, 0.2]$, E(R) = 4.36 and E(N.R.) = 4.36 - 3.6 = 0.76 > 0.

For $\alpha'' = [0.137, 0.863]$, E(R) = 3.32 and E(N.R.) = 3.32 - 3.6 = -0.28 < 0.

Hence, if Ray considers it worthwhile to play the game only when E(N.R.) > 0, then it is clear that before the observations he would have been willing to play, but such is not the case afterwards.
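Equations (2.4) and (2.5) can be evaluated directly for this example. The following is a minimal sketch, assuming the mouse matrices of section 2.5.3 and using the two-state steady-state formula π_1 = p_21 / (p_12 + p_21); all function names are illustrative.

```python
P1 = [[0.1, 0.9], [0.7, 0.3]]  # mouse no. 1
P2 = [[0.5, 0.5], [0.4, 0.6]]  # mouse no. 2
R = [[0.0, 8.0], [3.0, 2.0]]   # R[i][j]: reward for a transition from cage i+1 to cage j+1
COST_PER_TRANSITION = 3.6


def steady_state_2(p):
    """Steady-state probabilities of a 2-state chain."""
    a, b = p[0][1], p[1][0]
    return [b / (a + b), a / (a + b)]


def reward_per_transition(p, r):
    """Equation (2.4): sum_i pi_i * sum_j p_ij * r_ij."""
    pi = steady_state_2(p)
    return sum(pi[i] * p[i][j] * r[i][j] for i in range(2) for j in range(2))


def expected_net_revenue(alpha, matrices, r, c):
    """Equation (2.5) minus the cost per transition."""
    return sum(a * reward_per_transition(p, r) for a, p in zip(alpha, matrices)) - c
```

Evaluating `expected_net_revenue` with [0.8, 0.2] and then with [0.137, 0.863] gives a positive and a negative value respectively, which is the sign change discussed in the text.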

2.4.3 A Third Possible Cost Structure

For the reader who is familiar with Howard's policy iteration problems¹⁰ it has probably become apparent that the policy iteration model could be generalized to the situation where, instead of knowing the transition probabilities exactly, there is a multi-matrix framework. Hence, we have a third possible cost structure for the multi-matrix process.

The analysis of the process within this cost structure appears to be quite formidable. It is hoped that successful research will be completed in this area in the near future, as inclusion of uncertainty in the transition matrices of the policy iteration problem would be a major step forward in the modeling of Markov decision processes.

10 Howard, R. A., Dynamic Programming and Markov Processes, Technology Press and Wiley, 1960.

2.5 A Decision Problem Involving a Multi-Matrix Markov Process

As was mentioned at the start of this chapter, the primary purpose of this section is to introduce the reader to the use of statistical decision theory in multi-matrix Markov processes by considering, in detail, a specific decision problem.

2.5.1 Statement of the Problem

Consider a multi-matrix Markov process having Q possible matrices ${}^{1}P, {}^{2}P, \ldots, {}^{Q}P$. Suppose that the decision maker is faced with the cost framework discussed in section 2.4.1, namely $c_{ij}$ is the cost of assuming that matrix ${}^{i}P$ is being used when, in effect, ${}^{j}P$ is being utilized. Also let $d_{rs}$ be the cost of observing a transition from state r to state s. Then the following general problem, as will be shown, can be formulated as a dynamic programming problem.

The process is in state i at present and we know the vector $\underline{\alpha}' = (\alpha_1', \alpha_2', \ldots, \alpha_Q')$, where $\alpha_k'$ is the probability that ${}^{k}P$ is governing the transitions. During the next n periods we have the following options:

i) State that a particular matrix is being used and pay the resulting cost - this ends the decision process.
ii) Observe the next transition, paying the cost of the observation.
2.5.2 Dynamic Programming Framework of the Problem

Let $v_h(\underline{\alpha}', i)$ = the expected cost if an optimal policy is followed and we are in state i with probability vector $\underline{\alpha}'$ over the matrices and there are h decision periods left. Then as shown in section 2.4.1 (equation 2.3)

$$v_0(\underline{\alpha}', i) = \min_{1 \le k \le Q} \sum_{m=1}^{Q} \alpha_m' c_{km} \quad \text{(independent of } i\text{)} \qquad \ldots (2.6)$$

Also

$$v_h(\underline{\alpha}', i) = \min \begin{cases} \text{Stop (S):} & \min\limits_{1 \le k \le Q} \sum\limits_{m=1}^{Q} \alpha_m' c_{km} = v_0(\underline{\alpha}', i) \\[6pt] \text{Look (L):} & \sum\limits_{j=1}^{N} \sum\limits_{k=1}^{Q} \alpha_k' \, {}^{k}p_{ij} \left[ d_{ij} + v_{h-1}(\underline{\alpha}'', j) \right] \end{cases} \qquad 1 \le h \le n \quad \ldots (2.7)$$

where $\underline{\alpha}''$ is obtained through the use of Bayes' rule (see equation 2.1).

Theoretically the above recurrence relation and boundary conditions give us a solution to the problem. Unfortunately, the state vector is Q-dimensional for a process having Q possible matrices. Also, Q - 1 of the dimensions are continuous rather than discrete variables. However, when Q = 2, even for a large number of states the computations can be carried out. An example with 2 states will now be presented.


2.5.3 A 2-Dimensional Numerical Example

Again consider Bob and Ray's mouse game.

$${}^{1}P = \begin{bmatrix} 0.1 & 0.9 \\ 0.7 & 0.3 \end{bmatrix} \text{ (mouse no. 1)}, \qquad {}^{2}P = \begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \end{bmatrix} \text{ (mouse no. 2)}$$

Also

$$C = \begin{bmatrix} 0 & 5 \\ 16 & 0 \end{bmatrix} \quad \text{and} \quad D = (d_{rs}) = \begin{bmatrix} 0.5 & 1 \\ 0.3 & 0 \end{bmatrix}$$

where $c_{ij}$ = the cost of stating that mouse no. i is in use when mouse no. j is, and $d_{rs}$ = the cost of observing a transition from cage r to cage s.

Assume that a mouse is presently in cage 1 and that Ray's decision as to which mouse is being used can be made now, after the next, the next two, or the next three transitions (i.e. n = 3). Also assume that $\underline{\alpha} = (0.3, 0.7)$ and this includes the knowledge of the starting cage.

As there are only two possible matrices (Q = 2) we can replace $\underline{\alpha}'$ by $\alpha'$, the probability that mouse no. 1 is in use. Then from equation (2.6)

$$v_0(\alpha', i) = \min_{k = 1, 2} \left[ \alpha' c_{k1} + (1 - \alpha') c_{k2} \right] = \min \begin{cases} k = 1: & \alpha'(0) + (1 - \alpha')5 = 5 - 5\alpha' \\ k = 2: & \alpha'(16) + (1 - \alpha')0 = 16\alpha' \end{cases}$$

This says that, if no more observations are possible, Ray selects mouse no. 1 if $5 - 5\alpha' < 16\alpha'$, i.e. if $\alpha' > \tfrac{5}{21}$, and chooses mouse no. 2 if $\alpha' < \tfrac{5}{21}$. When $\alpha' = \tfrac{5}{21}$ the choice is immaterial.

Also $v_0(\alpha', i) = \min(5 - 5\alpha', 16\alpha')$ = Ray's expected cost if no observations remain and the probability that mouse no. 1 is being used is $\alpha'$.

Now from equation (2.7)

$$v_h(\alpha', i) = \min \begin{cases} \text{S:} & \min(5 - 5\alpha', 16\alpha') \\[4pt] \text{L:} & \sum\limits_{r=1}^{2} \left[ {}^{1}p_{ir}\alpha' + {}^{2}p_{ir}(1 - \alpha') \right] \left[ d_{ir} + v_{h-1}(\alpha'', r) \right] \end{cases} \qquad \ldots (2.8)$$

Using equation (2.1), $\alpha''$ can be expressed in terms of $\alpha'$ as follows:

$$\alpha'' = \frac{{}^{1}p_{ir}\alpha'}{{}^{1}p_{ir}\alpha' + {}^{2}p_{ir}(1 - \alpha')} \qquad \ldots (2.9)$$

As far as the variable $\alpha'$ is concerned, it is apparent that its range (for any specific number of observations left, h, and present cage, i) will be split into 3 sections for decision purposes:

    Say mouse no. 2      |      Look again      |      Say mouse no. 1
    0 .................. a .................... b .................... 1     (α')
From equations (2.8) and (2.9), "a" is where

$$16a = \sum_{r=1}^{2} \left[ {}^{1}p_{ir}a + {}^{2}p_{ir}(1 - a) \right] \left[ d_{ir} + v_{h-1}\!\left( \frac{{}^{1}p_{ir}a}{{}^{1}p_{ir}a + {}^{2}p_{ir}(1 - a)},\; r \right) \right] = J_h(a) \qquad \ldots (2.10)$$

and "b" is where

$$5 - 5b = J_h(b) \qquad \ldots (2.11)$$

h = 1, i = 1

Using equation (2.10), "a" is where

$$16a = [0.5 - 0.4a]\left[0.5 + v_0\!\left(\frac{0.1a}{0.5 - 0.4a},\; 1\right)\right] + [0.5 + 0.4a]\left[1 + v_0\!\left(\frac{0.9a}{0.5 + 0.4a},\; 2\right)\right]$$

This equation is solved for "a" using the expression obtained for $v_0(\alpha', i)$ earlier. The solution is a = 0.194.

Similarly equation (2.11) gives

$$5 - 5b = [0.5 - 0.4b]\left[0.5 + v_0\!\left(\frac{0.1b}{0.5 - 0.4b},\; 1\right)\right] + [0.5 + 0.4b]\left[1 + v_0\!\left(\frac{0.9b}{0.5 + 0.4b},\; 2\right)\right]$$

which solves for b = 0.407.

For $0.194 \le \alpha' \le 0.407$, $\dfrac{0.1\alpha'}{0.5 - 0.4\alpha'}$ is always less than 5/21, so

$$v_0\!\left(\frac{0.1\alpha'}{0.5 - 0.4\alpha'},\; i\right) = 16\left(\frac{0.1\alpha'}{0.5 - 0.4\alpha'}\right) \quad \text{throughout the range.}$$

Also $\dfrac{0.9\alpha'}{0.5 + 0.4\alpha'}$ is always greater than 5/21, so

$$v_0\!\left(\frac{0.9\alpha'}{0.5 + 0.4\alpha'},\; i\right) = 5 - 5\left(\frac{0.9\alpha'}{0.5 + 0.4\alpha'}\right) \quad \text{throughout the range.}$$

Consequently for $0.194 \le \alpha' \le 0.407$, from equation (2.8)

$$v_1(\alpha', 1) = [0.5 - 0.4\alpha']\left[0.5 + 16\left(\frac{0.1\alpha'}{0.5 - 0.4\alpha'}\right)\right] + [0.5 + 0.4\alpha']\left[1 + 5 - 5\left(\frac{0.9\alpha'}{0.5 + 0.4\alpha'}\right)\right]$$

$$v_1(\alpha', 1) = 3.25 - 0.70\alpha'$$

Note the cancellation of the $\alpha'$ terms occurring in the denominator. Fortunately this always occurs, so that it will be found that $v_h(\alpha', i)$ is a piecewise linear function of $\alpha'$.

We can summarize the h = 1, i = 1 situation as follows:

    α' range              Decision                     v_1(α', 1)
    0 ≤ α' ≤ .194         Stop and say mouse no. 2     16α'
    .194 ≤ α' ≤ .407      Observe process              3.25 - 0.70α'
    .407 ≤ α' ≤ 1         Stop and say mouse no. 1     5 - 5α'

(See diagram on next page)
h = 1, i = 2

In exactly the same manner as above we find

    α' range              Decision                     v_1(α', 2)
    0 ≤ α' ≤ .161         Stop and say mouse no. 2     16α'
    .161 ≤ α' ≤ .365      Observe process              2.12 + 2.89α'
    .365 ≤ α' ≤ 1         Stop and say mouse no. 1     5 - 5α'
h = 2, i = 1

Equations (2.10) and (2.11) together with the above expressions for $v_1(\alpha', 1)$ and $v_1(\alpha', 2)$ allow us to obtain

    α' range              Decision                     v_2(α', 1)
    0 ≤ α' ≤ .168         Stop and say mouse no. 2     16α'
    .168 ≤ α' ≤ .242      Observe process              1.81 + 5.249α'   #
    .242 ≤ α' ≤ .407      Observe process              3.25 - 0.7α'     #
    .407 ≤ α' ≤ 1         Stop and say mouse no. 1     5 - 5α'

# At $\alpha' = .242$, $\alpha'' = \dfrac{0.9(.242)}{0.5 + 0.4(.242)} = .365$ and therefore $v_1(\alpha'', 2)$ changes functional form (see the previous table). This in turn changes the functional form of $v_2(\alpha', 1)$.

[Figure: $v_2(\alpha', 1)$ plotted against $\alpha'$, with the regions "Say mouse no. 2", "Observe process" and "Say mouse no. 1" marked along the $\alpha'$ axis.]
In the same manner as above there results

    α' range              Decision                     v_2(α', 2)
    0 ≤ α' ≤ 0.133        Stop and say mouse no. 2     16α'
    0.133 ≤ α' ≤ 0.277    Observe process              1.420 + 5.375α'
    0.277 ≤ α' ≤ 0.282    Observe process              2.692 + 0.806α'
    0.282 ≤ α' ≤ 0.483    Observe process              3.392 - 1.679α'
    0.483 ≤ α' ≤ 1        Stop and say mouse no. 1     5 - 5α'

[Figure: $v_2(\alpha', 2)$ plotted against $\alpha'$, with the regions "Say mouse no. 2", "Observe process" and "Say mouse no. 1" marked along the $\alpha'$ axis.]

Now from equations (2.8) and (2.9)

$$v_3(\alpha', 1) = \min \begin{cases} \text{S:} & \min(5 - 5\alpha', 16\alpha') \\[4pt] \text{L:} & [0.5 - 0.4\alpha']\left[0.5 + v_2\!\left(\dfrac{0.1\alpha'}{0.5 - 0.4\alpha'},\; 1\right)\right] + [0.5 + 0.4\alpha']\left[1 + v_2\!\left(\dfrac{0.9\alpha'}{0.5 + 0.4\alpha'},\; 2\right)\right] \end{cases}$$


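But Ray has been told that $\alpha' = 0.3$, so

$$v_3(0.3, 1) = \min \begin{cases} \text{S:} & \min(3.5, 4.8) \\[4pt] \text{L:} & 0.38\left[0.5 + v_2\!\left(\tfrac{0.03}{0.38},\; 1\right)\right] + 0.62\left[1 + v_2\!\left(\tfrac{0.27}{0.62},\; 2\right)\right] \end{cases}$$

$$= \min \begin{cases} \text{S:} & 3.5 \\[4pt] \text{L:} & 0.38\left[0.5 + 16\left(\tfrac{0.03}{0.38}\right)\right] + 0.62\left[1 + 3.392 - 1.679\left(\tfrac{0.27}{0.62}\right)\right] \end{cases}$$

$$= \min \begin{cases} \text{S:} & 3.5 \\ \text{L:} & 2.94 \end{cases}$$

$$\therefore \; v_3(0.3, 1) = 2.94$$

Therefore he should observe the first transition and his expected cost is 2.94. Actually the solution has given him far more than just this answer. It also shows the optimum policy to follow (with the expected cost) for either starting state, any $\alpha$ vector and 0, 1 or 2 looks possible.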
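The recursion of equations (2.8) and (2.9) is small enough to evaluate by direct recursion for this two-matrix example. The sketch below is one possible implementation under the data of section 2.5.3, not the procedure used by the author; cages are indexed 0 and 1 in the code, and the function names are hypothetical.

```python
# Data of the two-mouse example (cages are indexed 0 and 1 here, not 1 and 2).
P1 = [[0.1, 0.9], [0.7, 0.3]]   # transition matrix if mouse no. 1 is in use
P2 = [[0.5, 0.5], [0.4, 0.6]]   # transition matrix if mouse no. 2 is in use
C = [[0.0, 5.0], [16.0, 0.0]]   # C[k][m]: cost of saying mouse k+1 when mouse m+1 is in use
D = [[0.5, 1.0], [0.3, 0.0]]    # D[r][s]: cost of observing a transition from cage r+1 to s+1


def stop_cost(a):
    """Cheapest terminal guess when mouse no. 1 has probability a (equation 2.6)."""
    return min(a * row[0] + (1.0 - a) * row[1] for row in C)


def look_cost(h, a, i):
    """Expected cost of observing one more transition from cage i with h looks left."""
    total = 0.0
    for r in range(2):
        prob = P1[i][r] * a + P2[i][r] * (1.0 - a)   # chance of seeing i -> r
        if prob == 0.0:
            continue
        posterior = P1[i][r] * a / prob              # Bayes update, equation (2.9)
        total += prob * (D[i][r] + value(h - 1, posterior, r))
    return total


def value(h, a, i):
    """v_h(a, i): minimum of stopping now and looking once more (equation 2.8)."""
    if h == 0:
        return stop_cost(a)
    return min(stop_cost(a), look_cost(h, a, i))


def best_action(h, a, i):
    """Return "stop" or "observe" for the current situation."""
    if h == 0 or stop_cost(a) <= look_cost(h, a, i):
        return "stop"
    return "observe"
```

Calling `value(3, 0.3, 0)` evaluates the three-look problem with the mouse in cage 1; the result comes out near 2.94, below the stopping cost of 3.5, so `best_action(3, 0.3, 0)` returns "observe". The same functions reproduce the linear pieces in the tables, for instance `value(1, 0.3, 0)` equals 3.25 - 0.70(0.3).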

2.5.4 Some Further Remarks

From the results of the numerical example some fairly general remarks can be made.

First, as demonstrated by induction in Appendix A, $v_h(\underline{\alpha}', i)$ is always a piecewise linear function of the components of $\underline{\alpha}'$. A second interesting point in the 2-matrix example is that the range of $\alpha'$ in which we decide to observe the next transition is a monotonic increasing function of the number of possible observations remaining. This is as expected. Furthermore, for more than two possible matrices we would expect the volume of the region of the (Q - 1)-dimensional space of the components of $\underline{\alpha}'$ in which we decide to observe the next transition to be a monotonic increasing function of the number of possible observations left (a monotonic increasing function is assumed to include the trivial case where the function is zero for all values of the argument).

It is clear from the dynamic programming form of the problem that the number of states is not really critical. There is one state variable for the state occupied. Increasing the number of states merely increases the range of this one state variable. On the other hand, introduction of an extra possible matrix increases the number of state variables by one. Hence, the number of possible matrices is the quantity that governs the feasibility of the dynamic programming solution.

Finally, it can be stated that the numerical example has demonstrated the possibility of statistical decision making for a multi-matrix Markov process.
CHAPTER 3

THE MULTIDIMENSIONAL BETA DISTRIBUTION

As was stated earlier, in this and the following two chapters we shall assume directly that the transition probabilities themselves are random variables. The present chapter is concerned with the multidimensional Beta distribution, a most convenient distribution to place over the transition probabilities of a single row of the transition matrix.

3.1 Conjugate Prior for a Multinomial Distribution

Consider a random variable that follows a multinomial distribution of order k, i.e., each time a draw is made it can fall into 1 of k categories.

Let $p_i$ = probability that a particular draw will fall in the ith category (i = 1, 2, ..., k), with $\sum_{i=1}^{k} p_i = 1$.

Let E = event that in n independent draws $n_i$ fall in the ith category (i = 1, 2, ..., k), with $\sum_{i=1}^{k} n_i = n$.

Then using the basic property of the multinomial distribution

$$\text{pr}(E \mid p_1, p_2, \ldots, p_k) = \frac{n!}{n_1! \, n_2! \cdots n_k!} \; \underbrace{p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}}_{\text{kernel}} \qquad \ldots (3.1)$$
Now following the suggestion of Raiffa and Schlaifer,¹¹ if we want to place a prior distribution over the $p_i$'s, it is advisable to select the prior such that it has the same kernel as the likelihood function of equation (3.1). Hence the conjugate prior should be of the form¹²

$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = C \, x_1^{m_1 - 1} x_2^{m_2 - 1} \cdots x_k^{m_k - 1} \qquad \ldots (3.2)$$

where $\sum_{i=1}^{k} x_i = 1$ and $x_i \ge 0$,

and as shown in Appendix B the proper normalizing constant to use is

$$C = \frac{1}{\beta(m_1, m_2, \ldots, m_k)} = \frac{\Gamma(m_1 + m_2 + \cdots + m_k)}{\Gamma(m_1)\,\Gamma(m_2) \cdots \Gamma(m_k)} \qquad \ldots (3.3)$$

(3.2) and (3.3) together define the multidimensional Beta distribution with parameters $(m_1, m_2, \ldots, m_k)$.


3.2 Important Properties of the Distribution

Consider a multidimensional Beta distribution

$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = \frac{1}{\beta(m_1, m_2, \ldots, m_k)} \, x_1^{m_1 - 1} x_2^{m_2 - 1} \cdots x_k^{m_k - 1} \qquad \ldots (3.4)$$

where $\sum_{i=1}^{k} x_i = 1$ and $x_i \ge 0$.

In more compact notation

$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k).$$

Then in Appendix B the following properties are established:

i) The distribution integrates to one, i.e., we have properly chosen the normalizing constant.

ii) The marginal distribution of a specific $p_j$ is given by

$$f_{p_j}(x) = \frac{1}{\beta\left(m_j, \sum_{i \ne j} m_i\right)} \, x^{m_j - 1} (1 - x)^{\sum_{i \ne j} m_i - 1}, \quad 0 \le x \le 1$$
$$= f_\beta\left(x \,\Big|\, m_j, \sum_{i \ne j} m_i\right) \qquad \ldots (3.5)$$

iii) The expected value of $p_j$ is

$$E(p_j) = \frac{m_j}{\sum_{i=1}^{k} m_i} \qquad \ldots (3.6)$$

iv) The variance of $p_j$ is

$$\overset{v}{p}_j = \frac{m_j \left(\sum_{i \ne j} m_i\right)}{\left(\sum_i m_i\right)^2 \left(1 + \sum_i m_i\right)} \qquad \ldots (3.7)$$

v) The covariance of $p_j$ and $p_u$ is

$$\text{Cov}(p_j, p_u) = \frac{-\, m_j m_u}{\left(\sum_i m_i\right)^2 \left(1 + \sum_i m_i\right)}, \quad j \ne u \qquad \ldots (3.8)$$

11 Raiffa, H., and Schlaifer, R., Applied Statistical Decision Theory, Graduate School of Business Administration, Harvard University, 1961, Chapter 3.

12 Throughout this study $f_{a, b, \ldots, d}(w, x, \ldots, z)$ will represent the joint density function of the random variables a, b, ..., d evaluated at the point (w, x, ..., z), and $f_{a \mid b}(w \mid x)$ will represent the conditional density function of the random variable "a" evaluated at the point "w" given that the random variable "b" has taken on the value "x".
3.3 Determination of the A Priori Parameters

Ideally one would like to select the parameters of the multidimensional Beta distribution in such a way that the prior estimates of the means and variances of the k individual $p_j$'s are satisfied.* However, this would require 2k - 1 parameters (since $\sum p_j = 1$, if k - 1 means are assigned, the kth one is automatically determined) and we have only k parameters at our disposal. Considerable effort was made to find a suitable prior distribution that possessed 2k - 1 parameters; suitable in the sense that it would allow easy Bayes modification and would be analytically tractable. Unfortunately all efforts were unsuccessful. Therefore, we restrict attention to the multidimensional Beta prior distribution (with only k degrees of freedom), which allows easy Bayes modification but is not ideal due to its deficiency in total parameters. We proceed as follows. Select the parameters such that the prior mean values of the $p_j$'s, i.e. the $E(p_j)$'s, are satisfied exactly. This uses up k - 1 of the parameters, leaving only 1 other degree of freedom. Obtain the final parameter by a least squares fit to the prior variances of the $p_j$'s, i.e. to the $\overset{v}{p}_j$'s. With these conditions in mind, in Appendix C the following expressions for the m's in terms of the $E(p_j)$'s and $\overset{v}{p}_j$'s are developed:

Define $M_{l.s.}$ = the value of $\sum_{i=1}^{k} m_i$ obtained by least squares.

Then

$$M_{l.s.} = \frac{\sum_{j=1}^{k} [E(p_j)]^2 \, [1 - E(p_j)]^2}{\sum_{j=1}^{k} \overset{v}{p}_j \, E(p_j) \, [1 - E(p_j)]} - 1 \qquad \ldots (3.9)$$

and

$$m_j = M_{l.s.} \, E(p_j) \qquad \ldots (3.10)$$

*Estimating the prior means and variances is not a trivial task. Also, we do not mean to imply that this is the only method for assigning values to the k parameters.
Numerical Example - 4 category multidimensional Beta.

    E(p_1) = 0.1      variance of p_1 = 0.004
    E(p_2) = 0.2      variance of p_2 = 0.008
    E(p_3) = 0.3      variance of p_3 = 0.001
    E(p_4) = 0.4      variance of p_4 = 0.004

Using equation (3.9) we find $M_{l.s.} = 47$ (to nearest integer). Then from equation (3.10), $m_1 \approx 5$, $m_2 \approx 9$, $m_3 \approx 14$, $m_4 \approx 19$.

$$f_{p_1, p_2, p_3}(x_1, x_2, x_3) = \frac{1}{\beta(5, 9, 14, 19)} \, x_1^{4} \, x_2^{8} \, x_3^{13} \, (1 - x_1 - x_2 - x_3)^{18}$$

where $\sum_{i=1}^{3} x_i \le 1$ and $x_i \ge 0$.
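The fit of equations (3.9) and (3.10) can be reproduced in a few lines. This is a minimal sketch with hypothetical function names; the rounding to integers mirrors the example and is not required by the equations themselves.

```python
def fit_beta_parameters(means, variances):
    """Return (M_ls, [m_1, ..., m_k]) from equations (3.9) and (3.10), unrounded."""
    a = [e * (1.0 - e) for e in means]                      # E(p_j)[1 - E(p_j)]
    m_ls = sum(x * x for x in a) / sum(v * x for v, x in zip(variances, a)) - 1.0
    return m_ls, [m_ls * e for e in means]


def rounded_parameters(means, variances):
    """Round M_ls to the nearest integer first, then round each m_j, as in the example."""
    m_ls, _ = fit_beta_parameters(means, variances)
    m_int = round(m_ls)
    return m_int, [round(m_int * e) for e in means]


def implied_variances(m):
    """Variances implied by parameters m through equation (3.7)."""
    total = sum(m)
    return [mj * (total - mj) / (total ** 2 * (1.0 + total)) for mj in m]
```

For the four-category example, `rounded_parameters([0.1, 0.2, 0.3, 0.4], [0.004, 0.008, 0.001, 0.004])` gives 47 and the parameters 5, 9, 14, 19. Comparing `implied_variances` with the prior variances shows how closely the single remaining degree of freedom can match them.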


3.4 Bayes Modification of the Distribution

As in section 3.1 consider a multinomial distribution of order k with parameters $p_1, p_2, \ldots, p_k$. Suppose the $p_j$'s are a priori jointly distributed according to the multidimensional Beta distribution

$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k).$$

Again, as in section 3.1, let E be the event that in n independent draws from the multinomial $n_i$ fall in the ith category (i = 1, 2, ..., k), with $\sum_{i=1}^{k} n_i = n$.

Then it is shown in Appendix D that the a posteriori distribution of the $p_j$'s is again a multidimensional Beta; only the parameters are modified. More precisely,

$$f_{p_1, p_2, \ldots, p_k \mid E}(x_1, x_2, \ldots, x_k) = f_\beta(x_1, x_2, \ldots, x_k \mid m_1 + n_1, m_2 + n_2, \ldots, m_k + n_k).$$

3.5 Development of the Multidimensional Beta Priors for a Specific N-State Markov Process


Consider state i of an N-state Markov process. If we are in state i $f_i$ times, then the numbers of transitions from state i to state j, $f_{ij}$ (j = 1, 2, ..., N), are multinomially distributed as follows:

$$\text{pr}(f_{i1}, f_{i2}, \ldots, f_{iN} \mid f_i) = \frac{f_i!}{f_{i1}! \, f_{i2}! \cdots f_{iN}!} \; p_{i1}^{f_{i1}} p_{i2}^{f_{i2}} \cdots p_{iN}^{f_{iN}}$$

if $p_{ij}$ = pr{state at time n + 1 is j | state at time n is i} and $\sum_{j=1}^{N} f_{ij} = f_i$.

This multinomial behavior is the motivation for using an a priori multidimensional Beta distribution over the transition probabilities $p_{i1}, p_{i2}, \ldots, p_{iN}$. Similar reasoning would lead us to use multidimensional Beta distributions over the transition probabilities for each of the other N - 1 states. Hence we utilize N distributions of the form

$$f_{p_{i1}, p_{i2}, \ldots, p_{iN}}(x_{i1}, x_{i2}, \ldots, x_{iN}) = f_\beta(x_{i1}, x_{i2}, \ldots, x_{iN} \mid m_{i1}, m_{i2}, \ldots, m_{iN}).$$

To determine the parameters ($m_{ij}$'s) we consider each row separately and use our prior estimates of the individual $E(p_{ij})$'s and $\overset{v}{p}_{ij}$'s as outlined in section 3.3.

Now because of the simple Bayes modification of the multidimensional Beta prior on the probabilities of a multinomial distribution (shown in section 3.4) we know that the separate multidimensional Beta priors of a Markov process will be simply modified as a result of a series of observed transitions. More precisely, the resultant posterior distributions will be new multidimensional Beta distributions having parameters $(m_{ij} + f_{ij})$, if $(m_{ij})$ are the a priori parameters and $(f_{ij})$ are the numbers of observed transitions from state i to state j.
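Because the update only adds observed counts to the prior parameters, it can be sketched in a few lines. The prior parameters and counts used in the usage note are hypothetical illustrations, not values from the text.

```python
def posterior_parameters(prior, counts):
    """Row-by-row Bayes modification: each m_ij becomes m_ij + f_ij."""
    return [[m + f for m, f in zip(m_row, f_row)]
            for m_row, f_row in zip(prior, counts)]


def row_means(params):
    """Expected transition probabilities E(p_ij) = m_ij / sum_j m_ij for each row."""
    return [[m / sum(row) for m in row] for row in params]
```

For instance, with a hypothetical prior `[[2, 6], [3, 3]]` and observed counts `[[2, 0], [0, 1]]`, the posterior parameters are `[[4, 6], [3, 4]]`, and the expected value of the first row moves from (0.25, 0.75) to (0.4, 0.6).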

3.6 Random Sampling from the Distribution

A simple method of randomly sampling from an analytic distribution is useful for simulation purposes, both when we use the simulation to empirically determine simple functions of the distribution (precisely what is done in section 4.1.6) and also when the random variables concerned are but a small part of a complex system being simulated. Hence two methods of randomly sampling from the multidimensional Beta distribution will be described.

Method 1 - Use of Marginal and Conditional Distributions

Consider the multidimensional Beta distribution

$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k).$$

Then as shown in Appendix E the following is a method of sampling from this distribution:

i) Draw $w_1$ from the simple Beta $f_\beta\left(w_1 \mid m_1, \sum_{i=2}^{k} m_i\right)$
ii) Draw $w_2$ from the simple Beta $f_\beta\left(w_2 \mid m_2, \sum_{i=3}^{k} m_i\right)$
...
k-1) Draw $w_{k-1}$ from the simple Beta $f_\beta(w_{k-1} \mid m_{k-1}, m_k)$

Then

$x_1 = w_1$
$x_2 = w_2 (1 - w_1)$
$x_3 = w_3 (1 - w_2)(1 - w_1)$
...
$x_{k-1} = w_{k-1} (1 - w_{k-2}) \cdots (1 - w_2)(1 - w_1)$

and

$$x_k = 1 - \sum_{i=1}^{k-1} x_i.$$

Hence, we have reduced the problem of randomly drawing from a k-dimensional Beta distribution to one of taking a single draw from each of k - 1 different simple Beta distributions. Methods for drawing from a simple Beta distribution have been discussed in the literature.¹³

13 See, for example, Galliher, H. P., "Simulation of Random Processes", Notes on Operations Research 1959, The Technology Press, Cambridge, 1959, p. 238.
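Method 1 can be sketched with the simple Beta generator from Python's standard library. This is an illustration under that assumption, not the generator the author had in mind; `sample_method1` is a hypothetical name, and the running product `remaining` holds (1 - w_1)(1 - w_2)... so that each x_j is formed as in the steps above.

```python
import random


def sample_method1(m, rng=random):
    """Draw one point (x_1, ..., x_k) from the multidimensional Beta with parameters m."""
    k = len(m)
    x = []
    remaining = 1.0                       # product of (1 - w_i) over the draws so far
    for j in range(k - 1):
        w = rng.betavariate(m[j], sum(m[j + 1:]))
        x.append(w * remaining)           # x_j = w_j (1 - w_{j-1}) ... (1 - w_1)
        remaining *= 1.0 - w
    x.append(remaining)                   # x_k = 1 - (x_1 + ... + x_{k-1})
    return x


example_draw = sample_method1([5, 9, 14, 19], random.Random(0))
```

Averaging many such draws should reproduce the means of equation (3.6), here 5/47, 9/47, 14/47 and 19/47.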
Method 2 - Use of Ratio of Gamma Distributions

Mosimann¹⁴ has shown the following interesting result:

If $t_i$ (i = 1, 2, ..., k) are independent random variables having Gamma distributions with parameters $m_i$ respectively and all with the same scale parameter q, i.e.

$$f_{t_i}(y_i) = f_\gamma(y_i \mid m_i, q) = \frac{q^{m_i}}{\Gamma(m_i)} \, y_i^{m_i - 1} e^{-q y_i}, \qquad 0 \le y_i < \infty,$$

then the random variables $p_i$ (i = 1, ..., k), where $p_i = t_i \big/ \sum_{j=1}^{k} t_j$, have a multidimensional Beta distribution with parameters $m_1, \ldots, m_k$, i.e.

$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k).$$

Therefore to randomly draw from $f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k)$ we can take independent draws ($y_i$) from the k simple Gamma distributions and then

$$x_1 = \frac{y_1}{\sum_i y_i}, \quad x_2 = \frac{y_2}{\sum_i y_i}, \quad \text{etc.}$$

Methods for sampling from a simple Gamma distribution have been developed (in fact, we require this in method 1, as sampling from a simple Beta involves sampling from two Gamma distributions).

14 Op. Cit.

Method 2 appears superior to method 1, as method 2 requires k draws from simple Gammas per set of k-dimensional Beta values while method 1 requires 2k - 2 draws for the same output. In any event, as mentioned earlier, being able to randomly sample from the multidimensional Beta distribution is important for two reasons. First, it allows us to obtain by simulation values of functions of multidimensional Beta distributions that are not attainable by analytic means. Secondly, it permits the use of multidimensional Beta variables in complex simulations.
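Method 2 is even shorter to write down. The sketch below assumes Python's standard-library Gamma generator and takes the common scale parameter to be 1, which is immaterial because it cancels in the ratio; the function names are hypothetical.

```python
import random


def sample_method2(m, rng=random):
    """Draw (x_1, ..., x_k) as ratios of k independent Gamma variates with shapes m."""
    y = [rng.gammavariate(mi, 1.0) for mi in m]   # one Gamma draw per category
    total = sum(y)
    return [yi / total for yi in y]


def sample_mean_and_variance(values):
    """Sample mean and (population) variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return mean, sum((v - mean) ** 2 for v in values) / n
```

A simulation check is to compare the sample mean and variance of, say, the first component with equations (3.6) and (3.7); for the parameters (5, 9, 14, 19) these are 5/47 and 5(42)/(47² · 48).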
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CHAPTER  4 


EFFECTS  ON  QUANTITIES  OF  INTEREST  OF  HAVING 
MULTIDIMENSIONAL  BETA  DISTRIBUTIONS  OVER  THE 

TRANSITION  PROBABILITIES 


Having  placed  the  transition  probabilities  into  a  framework  where 
our  uncertainty  about  them  can  be  easily  updated  in  the  Bayesian  sense 
(i.  e.  after  observing  some  transitions  the  density  functions  over  the 
transition  probabilities  can  be  simply  modified  according  to  Bayes' 
rule)  we  now  turn  our  attention  to  the  effects  of  uncertainty  in  the 
transition  probabilities  on  various  quantifies  of  interest  in  the  Markov 
model.  For  example,  the  steady  state  probabilities  are  no  longer  exact 
numbers,  rather  they  have  now  become  random  variables  since  they  are 
functions  of  the  transition  probabilities  which  are  random  variables.  It 
is  important,  both  for  statistical  decision  purposes  and  for  interest  in 
the  quantities  per  se,  that  we  be  able  to  describe  their  behavior  when  the 
transition  probabilities  are  not  known  exactly. 

4.1 Steady State Probabilities

4.1.1 General Remarks

As mentioned above, the steady state probabilities are functions of
the transition probabilities, which are random variables; hence the steady
state probabilities now become random variables. It will be assumed that
the following mechanism operates. Single values for each of the transition
probabilities are drawn from their distributions. With these now exactly
fixed transition probabilities the system is run to the steady state,
producing exact steady state probabilities. The whole process is repeated
many times, generating various values of the steady state probabilities.
Ideally we would like to know the actual density functions of the steady
state probabilities produced in this way; however, this is far more
easily said than accomplished. Actually, as will be explained in
Chapter 5, merely the expected values of the steady state probabilities
will be adequate for most decision purposes.

For 2-state processes exact closed-form expressions will be
obtained for the expected values of the steady state probabilities. The
analytic difficulties for more than 2 states are clearly illustrated by
the 3-state situation.

Consider the transition matrix

$$P = \begin{bmatrix} 1-a-b & a & b \\ c & 1-c-d & d \\ e & f & 1-e-f \end{bmatrix}$$

where $p_{ij} = \Pr\{\text{state at time } n+1 = j \mid \text{state at time } n = i\}$.

Now solving $\pi P = \pi$, or referring to another study done by the author
(see Appendix F), tells us that the steady state probability of being in
state 1 is given by

$$\pi_1 = \frac{ce + cf + de}{ad + ae + af + bc + bd + bf + ce + cf + de}.$$

Now suppose the transition probabilities are random variables. Then

$$E(\pi_1) = E\left[\frac{ce + cf + de}{ad + ae + af + bc + bd + bf + ce + cf + de}\right].$$

Unfortunately, there does not appear to be any simple way of evaluating
the right hand side for convenient distributions over the transition
probabilities. It is clear that the problem is even more formidable when
the number of states is greater than three.
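The closed form for the 3-state steady state can be checked numerically. The sketch below is illustrative only: the numerators used for states 2 and 3 (ae + af + bf and ad + bc + bd) are obtained by solving the same balance equations and are an assumption of this example rather than a quotation from Appendix F, and the numerical values of a through f are arbitrary.

```python
def steady_state_3(a, b, c, d, e, f):
    """Closed-form steady state of
    P = [[1-a-b, a, b], [c, 1-c-d, d], [e, f, 1-e-f]]."""
    num1 = c * e + c * f + d * e  # numerator of pi_1 as given in the text
    num2 = a * e + a * f + b * f  # assumed numerator of pi_2
    num3 = a * d + b * c + b * d  # assumed numerator of pi_3
    total = num1 + num2 + num3    # the nine-term common denominator
    return (num1 / total, num2 / total, num3 / total)


def apply_transition(pi, P):
    """Return the row vector pi P for a 3-state chain."""
    return tuple(sum(pi[i] * P[i][j] for i in range(3)) for j in range(3))


# Arbitrary illustrative transition probabilities.
a, b, c, d, e, f = 0.2, 0.1, 0.3, 0.4, 0.25, 0.15
P = [[1 - a - b, a, b], [c, 1 - c - d, d], [e, f, 1 - e - f]]
pi = steady_state_3(a, b, c, d, e, f)

# A steady state vector must be left unchanged by one more transition.
residual = max(abs(x - y) for x, y in zip(pi, apply_transition(pi, P)))
print(pi, residual)
```

The residual being at rounding level confirms that the vector satisfies $\pi P = \pi$; the difficulty discussed in the text lies in taking the expectation of this ratio, not in evaluating it for fixed probabilities.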

4.1.2 A Special 2-State Case Where We Are Able to Obtain the
Density Functions of the Steady State Probabilities

Consider the transition matrix

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

where

$$f_a(x) = 1, \quad 0 < x < 1$$

and

$$f_b(y) = 1, \quad 0 < y < 1.$$

"a" and "b" are assumed independent and we know that

$$\pi_1 = \frac{b}{a+b}.$$

Using the theory of derived distributions¹⁵ we find that

$$f_{\pi_1}(z) = \begin{cases} \dfrac{1}{2(1-z)^2}, & 0 < z < 1/2 \\[2ex] \dfrac{1}{2z^2}, & 1/2 < z < 1. \end{cases}$$

This distribution has a mean of 1/2, as we would expect.

15. See, for example, Wadsworth, G. P. and Bryan, J. G.,
Introduction to Probability and Random Variables, McGraw-Hill,
1959, Section 6.12.

The only reason that a closed form can be obtained for $f_{\pi_1}(\cdot)$
is that the extremely simple forms of $f_a(\cdot)$ and $f_b(\cdot)$ allow us to
perform the integrations involved in deriving the new distribution. The
distributions assumed for "a" and "b" are very special forms of the
Beta. For the more general form of the Beta the integrations cannot be
performed. Now, if we observe several transitions of the special
process considered, Bayes modification will result in more general Beta
forms for the distributions of "a" and "b". Hence, at that stage we
would again not be able to obtain the density function of $\pi_1$. Thus it is
seen that the special distributions (uniform) assumed for "a" and "b"
in this section are of very limited practical value despite the fact that
they allow us to obtain the density function of $\pi_1$ before any transitions
occur. We really want to be able to handle the more general situation
of independent Beta distributions over "a" and "b".
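As a quick check on the uniform special case of this section, the following sketch integrates the piecewise density of $\pi_1 = b/(a+b)$ numerically and confirms that it has total mass 1 and mean 1/2. The midpoint rule and the function names are choices made for this illustration only.

```python
def density_pi1(z):
    """Density of pi_1 = b / (a + b) when a and b are independent and uniform on (0, 1)."""
    if 0.0 < z < 0.5:
        return 1.0 / (2.0 * (1.0 - z) ** 2)
    if 0.5 <= z < 1.0:
        return 1.0 / (2.0 * z * z)
    return 0.0


def midpoint_integral(func, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of func over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


total_mass = midpoint_integral(density_pi1, 0.0, 1.0)
mean_pi1 = midpoint_integral(lambda z: z * density_pi1(z), 0.0, 1.0)
print(total_mass, mean_pi1)
```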

4.1.3. The 2-State Case Where One Transition
Probability is Known Exactly While the
Other is Beta Distributed

As mentioned earlier, it will be demonstrated in Chapter 5 that
merely the expected values of the steady state probabilities are adequate
for most decision purposes. Therefore, as a start in the right direction
we shall find the expected values of the steady state probabilities for the
Markov process with the following special structure:

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

where "a" is assumed exactly known but "b" has the Beta distribution

$$f_b(x) = f_\beta(x \mid m, n).$$

Physical Application:

There is a meaningful physical application of this model. Consider
customer behavior between 2 brands of a product. Let state 1
represent that the customer is buying our brand, state 2 that of our
competitor. Then, from extensive past data on our own customers it
is logical to assume that we know accurately the value of "a", the
probability that a customer buying our brand at time n will switch to the
competitor's brand on the next purchase. However, due to lack of
records on the competitor's customers we only have a rough idea about
the value of "b". Hence, it is reasonable to place a Beta prior
distribution over "b".

Determination of $E(\pi_2)$:

For a given (a, b) pair

$$\pi_2 = \frac{a}{a+b},$$

the steady state probability of being in state 2.

$$E(\pi_2) = \int_0^1 \frac{a}{a+x}\, f_b(x)\, dx = \int_0^1 \frac{a}{a+x}\, \frac{1}{\beta(m,n)}\, x^{m-1}(1-x)^{n-1}\, dx.$$

This is not an easy integral to evaluate in its current form.
However, as shown in detail in Appendix G, use of the hypergeometric
function, a well known function arising in certain physics problems,
enables us to obtain the following expression for $E(\pi_2)$:

$$E(\pi_2) = \frac{a}{a+1}\left[1 + \frac{n}{m+n}\left(\frac{1}{a+1}\right) + \frac{n(n+1)}{(m+n)(m+n+1)}\left(\frac{1}{a+1}\right)^2 + \cdots + \frac{n(n+1)\cdots(n+k-2)}{(m+n)(m+n+1)\cdots(m+n+k-2)}\left(\frac{1}{a+1}\right)^{k-1}\right] + E_k \qquad (4.1)$$

where

$$E_k < \frac{n(n+1)\cdots(n+k-1)}{(m+n)(m+n+1)\cdots(m+n+k-1)}\left(\frac{1}{a+1}\right)^k. \qquad (4.2)$$

It is clear that $E_k$ can be made arbitrarily small by choosing k
sufficiently large. Hence, we can come arbitrarily close to $E(\pi_2)$ by
selecting a large enough k.

Finally,

$$E(\pi_1) = 1 - E(\pi_2).$$
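Equations (4.1) and (4.2) translate directly into a short loop. The sketch below is a minimal illustration, not the IBM 7090 program mentioned later in the chapter; the function name and the choice to return the bound alongside the partial sum are assumptions of this example. It takes the Beta density as $x^{m-1}(1-x)^{n-1}/\beta(m,n)$, so that E(b) = m/(m+n).

```python
def expected_pi2_series(a, m, n, k):
    """Return (first k terms of (4.1), bound (4.2) on the truncation error E_k)."""
    ratio = 1.0 / (a + 1.0)
    term = 1.0      # the j = 0 term inside the bracket of (4.1)
    partial = 0.0
    for j in range(k):
        partial += term
        # Move from term j to term j + 1:
        # multiply by (n + j) / (m + n + j) and by 1 / (a + 1).
        term *= (n + j) / (m + n + j) * ratio
    # After the loop, `term` is the j = k term, which is the right side of (4.2).
    return a / (a + 1.0) * partial, term


value, bound = expected_pi2_series(a=0.5, m=9, n=1, k=7)
refined, _ = expected_pi2_series(a=0.5, m=9, n=1, k=200)
print(round(value, 5), bound, refined - value)
```

With a = 0.5, m = 9, n = 1 and k = 7 the partial sum rounds to 0.35882, and the difference from a much longer sum stays below the bound returned for k = 7.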


Asymptotic Check on the $E(\pi_2)$ Formula

In section G.2 of Appendix G an asymptotic check on
equation (4.1) is performed.

Monotonic Behavior of $E(\pi_2)$ as a Function of m + n for Fixed E(b)

In section G.3 of Appendix G it is shown that for fixed E(b),
i.e., for a fixed mean of the prior Beta distribution on "b", $E(\pi_2)$
monotonically decreases as m + n increases; i.e., as the variance of the
Beta distribution decreases ("b" becomes more and more exactly known).

Numerical Example

Consider the physical application of customer behavior between 2
brands of a product. Suppose that we know accurately that a customer
buying our brand this period will switch to the competitor's brand next
period with probability 0.5 (i.e., a = 0.5). Also, from our limited
knowledge of the competitor's customers we are willing to place the
following Beta distribution on "b":

$$f_b(x) = f_\beta(x \mid 9, 1); \quad \text{i.e., } m = 9,\ n = 1.$$

To obtain $E(\pi_2)$ to 5 significant figures, equation (4.2) tells
us to use k = 7 (a small number of terms for such a high degree of
accuracy). Then equation (4.1) gives

$$E(\pi_2) = 0.35882,$$

the expected value of the steady state probability that a particular
customer will be buying from the competitor.

This is remarkably close to the $(\pi_2)_{exact}$ obtained when "b" is exactly
known at 0.9. In that case

$$(\pi_2)_{exact} = \frac{0.5}{0.5 + 0.9} = 0.35714.$$

Also, for this example m = 9, n = 1 are the smallest integers for
which E(b) = 0.9. Therefore, due to the above-stated monotonic behavior
of $E(\pi_2)$, it is seen that for any combination of integers that make

$$\frac{m}{m+n} = 0.9$$

(e.g., 90 and 10) $E(\pi_2)$ must lie in the narrow range

$$0.35714 < E(\pi_2) \le 0.35882;$$

i.e., $E(\pi_2)$ is remarkably insensitive to the variance of the prior Beta
distribution on "b".
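The narrow range in this example can be reproduced without the series by integrating $a/(a+x)$ against the Beta density directly. The sketch below uses a plain midpoint rule, which is a numerical shortcut chosen for this illustration and is not the method of Appendix G; the parameter pairs (9, 1), (90, 10) and (900, 100) all have E(b) = 0.9.

```python
import math


def beta_pdf(x, m, n):
    """Beta density x^(m-1) (1-x)^(n-1) / beta(m, n), so that the mean is m / (m + n)."""
    log_norm = math.lgamma(m + n) - math.lgamma(m) - math.lgamma(n)
    return math.exp(log_norm + (m - 1) * math.log(x) + (n - 1) * math.log(1.0 - x))


def expected_pi2_quadrature(a, m, n, steps=20_000):
    """Midpoint-rule estimate of E[a / (a + b)] with b distributed Beta(m, n)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += a / (a + x) * beta_pdf(x, m, n)
    return total * h


a = 0.5
exact = a / (a + 0.9)  # "b" known exactly at its mean value
values = [expected_pi2_quadrature(a, 9 * scale, scale) for scale in (1, 10, 100)]
for scale, v in zip((1, 10, 100), values):
    print(10 * scale, round(v, 5))
print("exact", round(exact, 5))
```

The estimates decrease toward the exact value as m + n grows, which is the monotonic behavior cited from section G.3.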

Numerical Results

A program for the IBM 7090 digital computer was developed
which used equations (4.1) and (4.2) to give $E(\pi_2)$ accurate to 5 decimal
places for the following 9 x 9 x 14 or 1134 combinations:

a = 0.1, 0.2, 0.3, ..., 0.9
$\bar{b}$ = 0.1, 0.2, 0.3, ..., 0.9
m + n = 10, 20, 30, 40, 50, 60, 80, 100, 150, 200, 300, 500, 1000, ∞.

Appendix H presents portions of the tabulated results (the entire
tabulation would have required a prohibitive amount of typing; also graphical
representation would have required 81 separate figures). Analysis of
these results shows that, for a fixed $\bar{b}$, $E(\pi_2)$ is extremely insensitive to
m + n; i.e., to the variance of the Beta distribution. Hence, we can
approximate $E(\pi_2)$ by $(\pi_2)_{exact}$ where

$$(\pi_2)_{exact} = \frac{a}{a + \bar{b}}$$

(i.e., we assume "b" is exactly known at its mean value). The
approximation improves as m + n and/or "a" and/or $\bar{b}$ increase, but it is even
quite good for low m + n, "a" and $\bar{b}$. The fact that E(a/(a+b)) deviates
most from $a/(a+\bar{b})$ for low "a" and $\bar{b}$ is reasonable because under these
conditions the small mean value of the denominator makes it very
sensitive to variations in "b".

A typical plot of $E(\pi_2)$ vs. m + n for fixed "a" and $\bar{b}$ is
given in Figure 4.1. The alternate horizontal scale was developed as
follows. Writing $\check{b}$ for the variance of "b",

$$\check{b} = \frac{mn}{(m+n)^2 (m+n+1)} = \frac{\bar{b}(1-\bar{b})}{m+n+1}.$$

Therefore, for fixed $\bar{b}$, giving m + n also prescribes $\check{b}$. For example,
if $\bar{b}$ = 0.5 and we say that m + n = 24, then we have prescribed the
variance $\check{b}$ at the value (0.5)(0.5)/(25) = 0.01.
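The relation between m + n and the variance of "b" for a fixed mean is a one-line computation in each direction. The helper names below are hypothetical and serve only to reproduce the arithmetic of the example (mean 0.5, m + n = 24, variance 0.01).

```python
def beta_mean_and_variance(m, n):
    """Mean and variance of a Beta distribution with density x^(m-1) (1-x)^(n-1) / beta(m, n)."""
    total = m + n
    mean = m / total
    variance = m * n / (total ** 2 * (total + 1))
    return mean, variance


def total_from_variance(mean, variance):
    """Invert variance = mean (1 - mean) / (m + n + 1) to recover m + n."""
    return mean * (1.0 - mean) / variance - 1.0


mean_b, var_b = beta_mean_and_variance(12, 12)
print(mean_b, var_b, total_from_variance(0.5, 0.01))
```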

4.1.4. The 2-State Case Where Both Transition
Probabilities are Independently Beta
Distributed

Again our desire is to determine the expected values of the
steady state probabilities. First, a complicated summation form
for $E(\pi_2)$ will be presented. Theoretically it can be used to find $E(\pi_2)$
exactly for any integer values of the 4 Beta parameters. However, to
study the general behavior of $E(\pi_2)$ it will be convenient to make use of
the results of the last section.

Exact Summation Expression for $E(\pi_2)$

Here

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

with

$$f_a(x) = f_\beta(x \mid m_1, n_1)$$

and

$$f_b(y) = f_\beta(y \mid m_2, n_2).$$

$$E(\pi_2) = \int_0^1 \int_0^1 \frac{x}{x+y}\, f_\beta(x \mid m_1, n_1)\, f_\beta(y \mid m_2, n_2)\, dy\, dx$$


As  shown  in  Appendix  I 


m.,-2  n2-l  m2+n2-2-k 


Ptmj.nj)  P(m2,n2)  E^)  =  ^ 


k=0  j=0  r =0 


m  -11  n+k-l-j 

l  <-» 


•VV2"k 


J  /  m 


- - L - 1 - .  p(m.  +k+r+ 1 ,  n. )  + 

+  n.  -  2-  k-  j  1  1 


2  2 


m  -2  n.-l  m- 1  1  .  n7_M|  :\ 

2  2  j  \  2  I  n  +k-j  \  2  Jl  J 

V  V  Y  \  k  /  (-1) _ \  j  / \  r  ptm^m^n^r-l-j.nj) 

L  L  L  m+n  -2-k-j 

k=0  j=0  r =0 


Y Y  m. ..v-i  V 

+  (-1)  2  \  \  i  J  _  _ \ _ /  p(m  +m  +r,  n  ) 

ft  ft 


»2Y  J  Y1 L,v*-j(j) 

+  (-1)  2  \  ^  2 '  _ i_' P(m1+m2+n2+r'j‘l,n1) 

j=0  r  -0  n2  " ^
+  (-1)


m^  -1 


-1  nj -1 

I  I 

s=0  t=0 


n„ -1 \/  n„  -1  , 

2  1  ,  ..t*. 


s  \  t 


(-1)  (m^+m^+s+t) 


+  (-1) 


n^-l  n^  -1  . 

m2_1  y  /  n2_1  \  y  (  ni_1 

s=0  '  S  '  kto  k  ^  (m1+m2+S+k)2 


(-1) 


(4.3) 


where 


f 


m^  +m^+s+t 


— t - .  y 

+  m„  +  s  +  t  A 


(-D 


u-1 


1  2 


u=l 


X,  (m^+m^+s+t)  =  / 


rrij  +  m2  +  s  +  t 


In  2  + 


when  m.j  + 

+  s  +  t 

is  even 


nij  +  +  s  +  t 


m^+m^+s+t 


l 

u— 1 


(-D 


when  m^  +  +  s 

+  t  is  odd. 


NOTE: $\sum_{i=0}^{-1}$ means that there are no terms.

Due to the complexity of this expression it cannot be used to
study the general behavior of $E(\pi_2)$ as the m's and n's are allowed to
vary.

General Behavior of $E(\pi_1)$ and $E(\pi_2)$

Again consider

$$E(\pi_2) = \int_0^1 \int_0^1 \frac{1}{\beta(m_1,n_1)\,\beta(m_2,n_2)}\, \frac{x}{x+y}\, x^{m_1-1}(1-x)^{n_1-1}\, y^{m_2-1}(1-y)^{n_2-1}\, dy\, dx$$

$$= \int_0^1 \frac{x^{m_1-1}(1-x)^{n_1-1}}{\beta(m_1,n_1)} \underbrace{\left[\int_0^1 \frac{1}{\beta(m_2,n_2)}\, \frac{x}{x+y}\, y^{m_2-1}(1-y)^{n_2-1}\, dy\right]}_{L(x)} dx.$$

NOTATION:

Let $r \uparrow$ as $s \uparrow$ signify that r monotonically increases as s
increases and $r \downarrow$ as $s \uparrow$ signify that r monotonically decreases as s
increases.

Then from section 4.1.3 we know that for fixed $\bar{b} = m_2/(m_2+n_2)$

$$L(x) = E(\pi_2 \mid x) \downarrow \text{ as } m_2 + n_2 \uparrow.$$

This holds for any x between 0 and 1. From above

$$E(\pi_2) = \int_0^1 f_a(x)\, L(x)\, dx.$$

But

$$f_a(x) \ge 0$$

$$\therefore\ E(\pi_2) \downarrow \text{ as } m_2 + n_2 \uparrow,$$

and this is independent of $m_1$ and $n_1$. Therefore, for any Beta distribution
on "a" and a fixed $\bar{b}$, $E(\pi_1) \uparrow$ as $m_2 + n_2 \uparrow$; i.e., as $\check{b}$ (the variance of "b")
decreases. In fact, we can now make the following series of statements:

For                                      As                                                   Then

any Beta on "a", and fixed $\bar{b}$     $m_2 + n_2 \uparrow$; i.e., $\check{b} \downarrow$    $E(\pi_1) \uparrow$ and $E(\pi_2) \downarrow$

any Beta on "b", and fixed $\bar{a}$     $m_1 + n_1 \uparrow$; i.e., $\check{a} \downarrow$    $E(\pi_1) \downarrow$ and $E(\pi_2) \uparrow$

Suppose that both $\bar{a}$ and $\bar{b}$ are fixed. Then

$E(\pi_1) \uparrow$ when $\check{b} \downarrow$ and/or $\check{a} \uparrow$

$E(\pi_1) \downarrow$ when $\check{b} \uparrow$ and/or $\check{a} \downarrow$

$E(\pi_2) \uparrow$ when $\check{b} \uparrow$ and/or $\check{a} \downarrow$

Knowing "a" and "b" exactly corresponds to $\check{a} = \check{b} = 0$. Let
the corresponding exactly known steady state probabilities be $(\pi_1)_{ex}$
and $(\pi_2)_{ex}$. Then, due to the above results, the deviations from these
values (for integer m's and n's) occur as follows:

$$E(\pi_1 \mid \infty, \infty, m_2^*, n_2^*) \le E(\pi_1 \mid m_1, n_1, m_2, n_2) \le (\pi_1)_{ex} \le E(\pi_1 \mid m_1^*, n_1^*, \infty, \infty)$$

or

$$E(\pi_1 \mid \infty, \infty, m_2^*, n_2^*) \le (\pi_1)_{ex} \le E(\pi_1 \mid m_1, n_1, m_2, n_2) \le E(\pi_1 \mid m_1^*, n_1^*, \infty, \infty)$$

and

$$E(\pi_2 \mid m_1^*, n_1^*, \infty, \infty) \le E(\pi_2 \mid m_1, n_1, m_2, n_2) \le (\pi_2)_{ex} \le E(\pi_2 \mid \infty, \infty, m_2^*, n_2^*)$$

or

$$E(\pi_2 \mid m_1^*, n_1^*, \infty, \infty) \le (\pi_2)_{ex} \le E(\pi_2 \mid m_1, n_1, m_2, n_2) \le E(\pi_2 \mid \infty, \infty, m_2^*, n_2^*)$$

where $m_i^*$ and $n_i^*$ are the smallest integers for which

$$\frac{m_i^*}{m_i^* + n_i^*} = \frac{m_i}{m_i + n_i}$$

is satisfied.

Observe that these inequalities do not tell us on which side
of $(\pi_1)_{ex}$ $E(\pi_1)$ falls. However, they do reveal the very important
fact that the worst deviations occur in situations where one transition
probability is known exactly, situations that we have already closely
studied in section 4.1.3 through the use of the hypergeometric function.
There we found that replacing $E(\pi_1 \mid \infty, \infty, m_2, n_2)$ by $(\pi_1)_{ex}$ usually
resulted in a very small error. Now we know that replacing
$E(\pi_1 \mid m_1, n_1, m_2, n_2)$ by $(\pi_1)_{ex}$ will produce the same or an even smaller
error. Hence, we can accurately approximate $E(\pi_1 \mid m_1, n_1, m_2, n_2)$
by $(\pi_1)_{ex}$, the latter being easy to calculate.
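The bracketing of $E(\pi_1 \mid m_1, n_1, m_2, n_2)$ between the two "one probability known exactly" cases can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a midpoint rule on a 400-point grid instead of the summation (4.3), takes the Beta density as $x^{m-1}(1-x)^{n-1}/\beta(m,n)$, and uses arbitrarily chosen small parameters (2, 3) and (3, 2) so that the gaps are visible.

```python
import math


def beta_pdf(x, m, n):
    """Beta density x^(m-1) (1-x)^(n-1) / beta(m, n), so the mean is m / (m + n)."""
    log_norm = math.lgamma(m + n) - math.lgamma(m) - math.lgamma(n)
    return math.exp(log_norm + (m - 1) * math.log(x) + (n - 1) * math.log(1.0 - x))


def weighted_grid(m, n, steps=400):
    """Midpoint-rule nodes on (0, 1) paired with Beta(m, n) probability weights."""
    h = 1.0 / steps
    nodes = [(i + 0.5) * h for i in range(steps)]
    return [(x, beta_pdf(x, m, n) * h) for x in nodes]


def expected_pi1(m1, n1, m2, n2):
    """E[b / (a + b)] with a ~ Beta(m1, n1) and b ~ Beta(m2, n2), independent."""
    grid_b = weighted_grid(m2, n2)
    return sum(wa * wb * y / (x + y)
               for x, wa in weighted_grid(m1, n1)
               for y, wb in grid_b)


def expected_pi1_a_known(a_bar, m2, n2):
    """E[b / (a_bar + b)]: 'a' known exactly, 'b' Beta distributed."""
    return sum(wb * y / (a_bar + y) for y, wb in weighted_grid(m2, n2))


def expected_pi1_b_known(m1, n1, b_bar):
    """E[b_bar / (a + b_bar)]: 'b' known exactly, 'a' Beta distributed."""
    return sum(wa * b_bar / (x + b_bar) for x, wa in weighted_grid(m1, n1))


m1, n1, m2, n2 = 2, 3, 3, 2
a_bar, b_bar = m1 / (m1 + n1), m2 / (m2 + n2)
lower = expected_pi1_a_known(a_bar, m2, n2)
middle = expected_pi1(m1, n1, m2, n2)
upper = expected_pi1_b_known(m1, n1, b_bar)
exact = b_bar / (a_bar + b_bar)
print(lower, middle, exact, upper)
```

Both the fully uncertain value and the exact value fall between the two one-sided cases, which is the content of the inequalities above.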

4.1.5. The Special 3-State Case Where Only
1 Transition Probability is Not Known
Exactly

$$P = \begin{bmatrix} 1-a-b & a & b \\ c & 1-c-d & d \\ e & f & 1-e-f \end{bmatrix}$$

Suppose "a" is the only probability not known exactly. Then
it is convenient to let t = a/(1 - b) and to assume that

$$f_t(x) = f_\beta(x \mid m, n) = \frac{1}{\beta(m,n)}\, x^{m-1}(1-x)^{n-1}, \quad 0 < x < 1.$$

Then Appendix J reveals that the expected values of the steady
state probabilities can be expressed in terms of the hypergeometric
function F (defined in equations (G.2) and (G.3) of Appendix G):

$$E(\pi_1) = \frac{A}{B+C}\, F\!\left(1, n \,\Big|\, m+n \,\Big|\, \frac{B}{B+C}\right)$$

$$E(\pi_2) = \frac{D}{B+C} \left(\frac{m}{m+n}\right) F\!\left(1, n \,\Big|\, m+n+1 \,\Big|\, \frac{B}{B+C}\right) + \frac{bf/(1-b)}{B+C}\, F\!\left(1, n \,\Big|\, m+n \,\Big|\, \frac{B}{B+C}\right)$$

and

$$E(\pi_3) = 1 - E(\pi_1) - E(\pi_2)$$

where

$$A = \frac{ce + cf + de}{1-b}, \qquad B = d + e + f,$$

$$C = \frac{bc + bd + bf + ce + cf + de}{1-b}, \qquad D = e + f.$$

Appendix G also suggests a method for evaluating a hypergeometric
function for given arguments; hence, we can evaluate $E(\pi_1)$, $E(\pi_2)$
and $E(\pi_3)$ for given values of m, n, b, c, d, e and f.

The Beta distribution on a/(1-b) allows simple Bayes modification
of the prior. If r transitions occur from state 1 to state 2 and
s from 1 to 1, then the posterior distribution on t is

$$f_t(x) = f_\beta(x \mid m+r, n+s).$$

It  certainly  appears  that  the  approach  used  in  this  section  could 
be  extended  to  give  us  the  expected  values  of  the  7r's  in  larger  Markov 
processes  where  only  one  transition  probability  was  not  known  exactly. 
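The hypergeometric expression for $E(\pi_1)$ in this section can be evaluated by summing the series term by term and compared with direct numerical integration over t. The sketch below assumes that $F(1, n \mid c \mid z)$ denotes the Gauss series $\sum_k \frac{n(n+1)\cdots(n+k-1)}{c(c+1)\cdots(c+k-1)} z^k$, which is consistent with equation (4.1) but is not quoted from Appendix G; the parameter values and function names are arbitrary choices for illustration.

```python
import math


def hypergeometric_1n(n, denom, z, tol=1e-15, max_terms=100_000):
    """Sum_k (n)_k / (denom)_k * z^k, i.e. F(1, n | denom | z), for 0 <= z < 1."""
    term, total, k = 1.0, 0.0, 0
    while term > tol and k < max_terms:
        total += term
        term *= (n + k) / (denom + k) * z
        k += 1
    return total


def expected_pi1_series(b, c, d, e, f, m, n):
    """E(pi_1) when only t = a / (1 - b) is uncertain, with t ~ Beta(m, n)."""
    A = (c * e + c * f + d * e) / (1.0 - b)
    B = d + e + f
    C = (b * c + b * d + b * f + c * e + c * f + d * e) / (1.0 - b)
    return A / (B + C) * hypergeometric_1n(n, m + n, B / (B + C))


def expected_pi1_quadrature(b, c, d, e, f, m, n, steps=20_000):
    """Midpoint-rule check: average the closed-form pi_1 over the Beta(m, n) density of t."""
    log_norm = math.lgamma(m + n) - math.lgamma(m) - math.lgamma(n)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        a = t * (1.0 - b)
        pi1 = (c * e + c * f + d * e) / (
            a * (d + e + f) + b * c + b * d + b * f + c * e + c * f + d * e)
        pdf = math.exp(log_norm + (m - 1) * math.log(t) + (n - 1) * math.log(1.0 - t))
        total += pi1 * pdf
    return total * h


params = dict(b=0.2, c=0.3, d=0.1, e=0.25, f=0.35)
prior = expected_pi1_series(m=3, n=2, **params)
check = expected_pi1_quadrature(m=3, n=2, **params)
# Bayes modification: r = 4 observed transitions 1 -> 2 and s = 6 observed 1 -> 1.
posterior = expected_pi1_series(m=3 + 4, n=2 + 6, **params)
print(prior, check, posterior)
```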

4.1.6. Simulation of the Steady State Behavior of
3-State Markov Processes Whose Transition
Probabilities are Multidimensional
Beta Distributed

As discussed earlier, for

$$P = \begin{bmatrix} 1-a-b & a & b \\ c & 1-c-d & d \\ e & f & 1-e-f \end{bmatrix}$$

where a, b, c, d, e and f are random variables, there is no simple
analytic method of determining $E(\pi_1)$, $E(\pi_2)$ and $E(\pi_3)$. Hence, it
was necessary to resort to simulation techniques.

Simulation Procedure

Consider the 3-state process having multidimensional Beta
distributed transition probabilities with parameters $M = (m_{ij})$. Using
one of the two techniques outlined in section 3.6, a random draw of u
and v was made from $f_{a,b}(u,v) = f_\beta(u,v \mid m_{12}, m_{13}, m_{11})$, of w and x
from $f_{c,d}(w,x) = f_\beta(w,x \mid m_{21}, m_{23}, m_{22})$, and of y and z from
$f_{e,f}(y,z) = f_\beta(y,z \mid m_{31}, m_{32}, m_{33})$. Then we know (from section 4.1.1)
that the corresponding steady state probabilities are

$$\pi_1 = \frac{wy + wz + xy}{ux + uy + uz + vw + vx + vz + wy + wz + xy},$$

$$\pi_2 = \frac{uy + uz + vz}{\text{denominator}}$$

and

$$\pi_3 = \frac{ux + vw + vx}{\text{denominator}}.$$

This procedure was repeated a large number of times (the usual
number was 700) using an IBM 7090 digital computer, and the sample
mean values of $\pi_1$, $\pi_2$ and $\pi_3$ approached $E(\pi_1)$, $E(\pi_2)$ and $E(\pi_3)$,
respectively. The sample standard deviation was also calculated to
obtain an idea of the spread of the distribution, hence the significance
of a given deviation of the sample mean from some fixed number. A
typical plot of a sample mean and sample standard deviation of the mean
as a function of the sample number is given in Figure 4.2. The
stabilization of the sample mean in this figure indicates that the sample size
of 700 is reasonable.

Parameter Values Used and Results

For each set of parameter values the following quantities were
recorded:

i) The exact steady state probabilities $(\pi_1)_{ex}$, etc., corresponding
to the transition probabilities being known exactly at their mean
values.

ii) The sample mean values, $\bar{\pi}_1$, $\bar{\pi}_2$ and $\bar{\pi}_3$.

iii) The sample standard deviations of the means, $s_{\bar{\pi}_1}$, $s_{\bar{\pi}_2}$
and $s_{\bar{\pi}_3}$.

iv) 95 per cent "confidence" regions for each of $E(\pi_1)$, $E(\pi_2)$
and $E(\pi_3)$. (See the discussion below.)

e.g., $\bar{\pi}_1 - 1.96\, s_{\bar{\pi}_1} \le E(\pi_1) \le \bar{\pi}_1 + 1.96\, s_{\bar{\pi}_1}$

is the 95 per cent "confidence" region for $E(\pi_1)$.

v) The total per cent deviations (T.P.D.) of the sample means
from the exact steady state values

$$\text{T.P.D.} = 100 \times \left[\, |(\pi_1)_{ex} - \bar{\pi}_1| + |(\pi_2)_{ex} - \bar{\pi}_2| + |(\pi_3)_{ex} - \bar{\pi}_3| \,\right] \qquad (4.4)$$
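The simulation procedure and the five recorded quantities can be sketched in a few lines. This is a modern illustration, not the IBM 7090 program: it assumes that the multidimensional Beta distribution is what is now usually called the Dirichlet distribution (row i has parameters $m_{i1}, m_{i2}, m_{i3}$ and mean $m_{ij}/\sum_j m_{ij}$), and it draws each row by normalizing independent gamma variates, a standard technique that is not necessarily either of the two methods of section 3.6. The parameter matrix and the seed are arbitrary.

```python
import math
import random


def draw_row(params, rng):
    """Draw one row of transition probabilities by normalizing independent gamma variates."""
    gammas = [rng.gammavariate(p, 1.0) for p in params]
    total = sum(gammas)
    return [g / total for g in gammas]


def steady_state_3(P):
    """Closed-form steady state of a 3-state chain from its off-diagonal entries."""
    (_, a, b), (c, _, d), (e, f, _) = P
    num1 = c * e + c * f + d * e
    num2 = a * e + a * f + b * f
    num3 = a * d + b * c + b * d
    total = num1 + num2 + num3
    return [num1 / total, num2 / total, num3 / total]


def simulate(M, samples=700, seed=0):
    """Return exact values, sample means, std. deviations of the means, 95% regions, T.P.D."""
    rng = random.Random(seed)
    draws = [steady_state_3([draw_row(row, rng) for row in M]) for _ in range(samples)]
    means = [sum(d[j] for d in draws) / samples for j in range(3)]
    std_of_mean = [
        math.sqrt(sum((d[j] - means[j]) ** 2 for d in draws) / (samples - 1) / samples)
        for j in range(3)
    ]
    # Exact steady state with every transition probability fixed at its mean value.
    mean_P = [[m / sum(row) for m in row] for row in M]
    exact = steady_state_3(mean_P)
    regions = [(mu - 1.96 * s, mu + 1.96 * s) for mu, s in zip(means, std_of_mean)]
    tpd = 100.0 * sum(abs(x - mu) for x, mu in zip(exact, means))  # equation (4.4)
    return exact, means, std_of_mean, regions, tpd


M = [[8, 2, 1], [3, 9, 2], [1, 2, 10]]  # illustrative m_ij values
exact, means, std_of_mean, regions, tpd = simulate(M)
print(exact, means, std_of_mean, regions, tpd)
```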

We have two measures of the difference between $(\pi_j)_{ex}$ and
$E(\pi_j)$:

i) If the 95 per cent "confidence" region on $E(\pi_j)$ includes $(\pi_j)_{ex}$,
we are more inclined to say that $E(\pi_j)$ is approximately equal to $(\pi_j)_{ex}$
than if the region does not include $(\pi_j)_{ex}$. The use of confidence
intervals is not being advocated as this dissertation is Bayesian in
nature. Fortunately, for a reasonably flat prior distribution on $E(\pi_j)$ in
the neighborhood of $(\pi_j)_{ex}$ (this situation can be assumed here) the a
posteriori distribution on $E(\pi_j)$ is very close to the density function
of $\bar{\pi}_j$ given $E(\pi_j)$; i.e., the a posteriori distribution of $E(\pi_j)$ is
approximately normal with mean $\bar{\pi}_j$ and standard deviation $s_{\bar{\pi}_j}$.
Hence, the 95 per cent confidence region is the central section making
up 95 per cent of the area of the a posteriori density function of $E(\pi_j)$.
If it includes $(\pi_j)_{ex}$, we should be more inclined to say that $E(\pi_j) \approx (\pi_j)_{ex}$
than if it doesn't include $(\pi_j)_{ex}$.

ii) The smaller T.P.D. is, the more likely that $(\pi_j)_{ex}$ is a
good approximation to $E(\pi_j)$.

Figure 4.2: Typical Plots of $\bar{\pi}_2$ and $s_{\bar{\pi}_2}$ vs Sample

The following systematic method was used to decide on a reasonable
number of $(m_{ij})$ sets to test. The $m_{ij}$'s were split into two
categories, namely Low and High. So that two H's or two L's would not
necessarily be the same, a random element was added as follows:

Each time an L was requested, a 1, 2, or 3 was selected,
each with probability 1/3.

Each time an H was requested, a 7, 8, 9, 10, 11 or 12 was
selected, each with probability 1/6.

To perform a reasonable number of tests which cover a wide
variety of patterns it was decided to have each diagonal entry either high
or low and each pair of off-diagonal elements in the same row both
high or both low. This results in 20 essentially different patterns;
permutations of states give the same pattern.

Eighteen other miscellaneous experiments were performed as indicated
in the legend of Appendix K. They were undertaken to provide extra
points on the plot of T.P.D. (to be discussed shortly).
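The count of 20 essentially different Low/High patterns can be reproduced by enumeration. The sketch below assumes that a pattern is determined by the unordered collection of three row types, each row type being a (diagonal, off-diagonal pair) choice of L or H; this reading of "permutations of states give the same pattern" is an interpretation made for this example. The helper that draws actual $m_{ij}$ values follows the stated selection probabilities, with hypothetical function names.

```python
import random
from itertools import combinations_with_replacement, product

# A row type is (level of the diagonal entry, level shared by the two off-diagonal entries).
ROW_TYPES = list(product("LH", repeat=2))

# Relabeling the states reorders the rows, so a pattern is a multiset of three row types.
patterns = list(combinations_with_replacement(ROW_TYPES, 3))


def draw_parameter(level, rng):
    """L: 1, 2 or 3 with probability 1/3 each; H: 7, ..., 12 with probability 1/6 each."""
    return rng.choice([1, 2, 3]) if level == "L" else rng.choice([7, 8, 9, 10, 11, 12])


def build_parameters(pattern, rng):
    """Turn a pattern of three row types into a 3 x 3 matrix of m_ij values."""
    M = []
    for i, (diagonal_level, off_diagonal_level) in enumerate(pattern):
        row = [draw_parameter(off_diagonal_level, rng) for _ in range(3)]
        row[i] = draw_parameter(diagonal_level, rng)
        M.append(row)
    return M


rng = random.Random(1)
example = build_parameters(patterns[5], rng)
print(len(patterns), patterns[5], example)
```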

The detailed results are presented in Appendix K. From these
Figure 4.3 was developed. It shows T.P.D. plotted as a function of
two parameters, the average diagonal transition probability and the
sum of the parameters of the multidimensional Beta distribution. More
will be said on this plot later.

After a careful study of the results of Appendix K there is one
important point that can be made. The low total per cent deviations
throughout, as well as the large number of times that the 95 per cent
"confidence" region on $E(\pi_j)$ overlaps the corresponding $(\pi_j)_{ex}$ value,
clearly show the marked insensitivity of the expected values of the
steady state probabilities to the variances of the multidimensional Beta
distributions. By an analytic argument this fact has been shown to hold
for the 2-state situation in sections 4.1.3 and 4.1.4.

At first only experiments 1-26 were performed. Close study
of their results revealed two interesting points. First, for fixed values
of the means of the transition probabilities the deviations of the $E(\pi_j)$'s
from the $(\pi_j)_{ex}$'s decrease as the sum of the $m_{ij}$'s increases. This is
as expected because the larger the sum of the $m_{ij}$'s, the closer we are
to exactly known transition probabilities. Secondly, the worst deviations
of the $E(\pi_j)$'s from the $(\pi_j)_{ex}$'s appear to occur when the diagonal $m_{ii}$'s
dominate the other parameter values; i.e., when the means of the
diagonal transition probabilities are high. This is directly comparable
to the 2-state situation where the largest deviations occurred for small a
and b (the off-diagonal elements). When the means of the diagonal
transition probabilities are high, all of a, b, c, d, e and f are small.
Now each steady state probability is made up of a numerator over a
denominator. Each of these, in turn, is the sum of a series of
cross-products of a, b, c, d, e and f. Hence, if a, b, c, d, e and f are
small, the numerator and denominator will be small and hence relatively
more sensitive to fluctuations in a, b, c, d, e and f than when some or
all of those parameters are large. Hence, it is expected that $E(\pi_j)$
would deviate most from $(\pi_j)_{ex}$ when all of a, b, c, d, e and f were
small.

Figure 4.3: Plot of T.P.D. for 3 State Processes

In an effort to graphically portray these two important points
the T.P.D. values of experiments 1-26 were plotted in Figure 4.3. It
became apparent that other experiments would have to be performed to
adequately cover the grid. Hence, experiments 27-38 were conducted.
Their results further substantiated the three points mentioned earlier:

i) The expected values of the steady state probabilities are
very insensitive to the variances of the multidimensional Beta
distributions.

ii) For fixed values of the means of the transition probabilities
the deviations of the $E(\pi_j)$'s from the $(\pi_j)_{ex}$'s decrease as the sum of
the $m_{ij}$'s increases.

iii) The deviations of the $E(\pi_j)$'s from the $(\pi_j)_{ex}$'s increase as
the diagonal $m_{ii}$'s more and more dominate the other parameter values;
i.e., as the mean values of the diagonal transition probabilities become
large.

It was hoped that a functional relationship between T.P.D.,
$\sum_{i,j} m_{ij}$ and the average value of the exact diagonal transition probability
$(\bar{p}_{ii})$ could be developed. However, although Figure 4.3 illustrates the
above points ii) and iii) qualitatively, it is clear that not even rough
iso-T.P.D. curves can be drawn on the figure. If such curves could
have been drawn, for a given Markov process with multidimensional
Beta distributions over the transition probabilities, we could estimate
the T.P.D. value (i.e., the inaccuracy of using the $(\pi_j)_{ex}$'s for the
$E(\pi_j)$'s) by the fact that we would know $\sum_{i,j} m_{ij}$ and $\bar{p}_{ii}$ and would use the
curves to obtain the corresponding T.P.D. value.

In Chapter 5 we shall define the expected reward per time
period in the steady state as

$$E(R) = \sum_{j=1}^{N} r_j\, E(\pi_j)$$

where N is the number of states and $r_j$ is the reward per period for
being in state j. If we define

$$(R)_{ex} = \sum_{j=1}^{N} r_j\, (\pi_j)_{ex},$$

ideally we would like to know the behavior of $|E(R) - (R)_{ex}|$ as a function
of the $m_{ij}$'s. However, this behavior would also be a function of the
$r_j$'s and therefore is quite difficult to study because of the large number
of parameters involved. Thus, we have compromised by studying the
behavior of

$$\text{T.P.D.} = \sum_{j=1}^{N} |E(\pi_j) - (\pi_j)_{ex}|.$$

In summary, it appears that in most 3-state situations where the
transition probabilities are multidimensional Beta distributed a good
approximation to the expected values of the steady state probabilities is
obtained by assuming that the transition probabilities are exactly known
at their mean values and using the corresponding exact steady state
probabilities (quantities that are very easy to evaluate) as the
approximation. This approximation becomes questionable when the diagonal
transition probabilities are large and the $m_{ij}$'s are relatively low in
magnitude. However, as will be shown in the 4-state situation, $m_{ij}$'s
low in magnitude are very unlikely in practice because any sort of prior
knowledge about the transition probabilities will give reasonably small
variances on the Betas which, in turn, will make the $m_{ij}$'s large.

4.1.7. Simulation of the Steady State Behavior
of 4-State Markov Processes Whose
Transition Probabilities are Multidimensional
Beta Distributed

Essentially the same simulation procedure as for 3 states was
used here. However, now we have 4 x 4 or 16 $m_{ij}$'s instead of 9.
Also, the expressions for the steady state probabilities for exactly
known transition probabilities are far more complex. It turns out that
the denominator of each steady state probability is the sum of 64 triple
cross products of the transition probabilities. More will be said on
this point in Appendix E.

Part of the L-H framework of the 3-state simulation was used
here, but more of the $m_{ij}$'s were generated randomly at higher values
as indicated in the legend of Appendix M.

An analysis of the results of Appendix M again reveals low
total per cent deviations throughout as well as a large number of times
that the 95 per cent "confidence" region on $E(\pi_j)$ overlaps the
corresponding $(\pi_j)_{ex}$. As in the 3-state case this indicates that the expected
values of the steady state probabilities are quite insensitive to the
variances of the Beta distributions. This is most encouraging as it suggests
that the same situation exists for any size Markov process. Unfortunately,
simulation of processes with more than 4 states would take a prohibitively
long time. Even for 4 states the computer time was becoming
appreciable; e.g., experiments 52 to 56 required approximately 5 minutes on
the 7090.

We again observe that for fixed values of the means of the
transition probabilities $E(\pi_j)$ approaches $(\pi_j)_{ex}$ as the sum of the
$m_{ij}$'s increases. Therefore, the approximation of using $(\pi_j)_{ex}$ for
$E(\pi_j)$ will improve as the sum of the $m_{ij}$'s increases. Fortunately,
as will now be demonstrated, in most physical situations the sum of the
$m_{ij}$'s will be relatively large.

Consider the example presented in section 3.3 of a 4 category multidimensional Beta,

  j    $E(p_j)$    variance of $p_j$    $\sigma_{p_j}$
  1    0.1         0.004                0.063
  2    0.2         0.008                0.089
  3    0.3         0.001                0.032
  4    0.4         0.004                0.063

We found that $\sum_j m_j = 47$ with $m_1 = 5$, $m_2 = 9$, $m_3 = 14$, $m_4 = 19$. These $m_j$ values, which are reasonably large, correspond to quite large values of the standard deviations of the $p_j$'s. In most physical situations we would know enough a priori information about the $p_j$'s to have smaller standard deviations than these. For the above example, keeping the $\sigma_{p_j}$'s in the same ratio we get the following results using equations (3.9) and (3.10):
The $\sigma_{p_j}$ values under the $0.75\sigma$ and $0.5\sigma$ columns are of reasonable size for a physical problem and they are seen to give very high values of $\sum_j m_j$. Some similar calculations were done for other $E(p_j)$ and $\sigma_{p_j}$ values and the results were qualitatively the same: for reasonable standard deviations of the $p_j$'s the $\sum_j m_j$ value is quite large, hence the approximation of using $(\pi_j)_{ex}$ for $E(\pi_j)$ is quite good. More will be said on the approximations when they are used for statistical decision purposes in Chapter 5.
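The way $\sum_j m_j$ grows as the standard deviations shrink can be illustrated with a short sketch. It assumes the usual multidimensional Beta variance relation, var$(p_j) = E(p_j)[1 - E(p_j)] / (\sum_j m_j + 1)$, with all means held fixed. It is not the procedure of equations (3.9) and (3.10), and the function name is illustrative only.

```python
def total_after_scaling(total_m: float, factor: float) -> float:
    """Sum of the m_j's after every standard deviation is multiplied by
    `factor`, with all means E(p_j) held fixed.

    Assumption: var(p_j) = mean_j * (1 - mean_j) / (total_m + 1),
    so (total_m + 1) scales as 1 / factor**2.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (total_m + 1.0) / factor**2 - 1.0


if __name__ == "__main__":
    # Start from a total of 47 and shrink all standard deviations together.
    for factor in (1.0, 0.75, 0.5):
        print(factor, round(total_after_scaling(47, factor), 1))
```

Under this assumption, halving every standard deviation roughly quadruples the parameter sum, which is the qualitative behaviour described above.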


4.2. First Passage Times

Let $n_{ij}$ be the number of transitions to get to state j for the first time given that the process is in state i before the first transition. If i = j, then $n_{ii}$ is also called the recurrence time for state i.

We shall only be concerned with the mean first passage time and not with the entire probability mass function. Still, the same difficulties as those met in evaluating the expected values of the steady state probabilities are encountered for 3 or more states. This is quickly shown by recalling¹⁶ that the mean recurrence time and steady state probability are related by $E(n_{ii}) = 1/\pi_i$. Hence, we shall only study the 2 state situation where both transition probabilities are independently Beta distributed.

16 Howard, R. A., Dynamic Probabilistic Systems (in preparation).

The mechanism assumed will again be as follows. We select a value for each of the transition probabilities from its Beta distribution. With these values we calculate the associated exactly known mean first passage times $(\bar n_{ij})$. We repeat this process over and over obtaining a whole series of values for the mean first passage times. We want to obtain analytic expressions for the long run average values of these series of mean first passage times; i.e., for $E(\bar n_{ij}) \equiv \bar{\bar n}_{ij}$, the expected mean recurrence times.

Consider

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

where

$$f_a(x) = f_\beta(x \mid m_1, n_1)$$

and

$$f_b(y) = f_\beta(y \mid m_2, n_2).$$

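Then, as shown in Appendix N, the expected mean recurrence times are given by

$$E(n_{11}) = 1 + \bar a\,\frac{1 - 1/(m_2+n_2)}{\bar b - 1/(m_2+n_2)}, \qquad E(n_{12}) = \frac{1 - 1/(m_1+n_1)}{\bar a - 1/(m_1+n_1)},$$

$$E(n_{21}) = \frac{1 - 1/(m_2+n_2)}{\bar b - 1/(m_2+n_2)}, \qquad E(n_{22}) = 1 + \bar b\,\frac{1 - 1/(m_1+n_1)}{\bar a - 1/(m_1+n_1)}, \tag{4.5}$$

where $\bar a = m_1/(m_1+n_1)$ and $\bar b = m_2/(m_2+n_2)$ are the means of the two Beta distributions.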


There are several comments that can now be made. First, unlike the expected values of the steady state probabilities where we required the use of the hypergeometric function, equation (4.5) reveals that the $E(n_{ij})$'s are very simple functions of the $m_i$'s and $n_i$'s. Secondly, to calculate $E(n_{11})$, although it is a function of "a", we only require $\bar a$. A similar comment can be made about $E(n_{22})$ and "b". Finally, the following can be said about the sensitivity of the $E(n_{ij})$'s to the variances of the Beta distributions. The terms $1 - 1/(m_1+n_1)$ and $1 - 1/(m_2+n_2)$ rapidly approach 1; hence, the terms that determine the sensitivity are $\bar a - 1/(m_1+n_1)$ and $\bar b - 1/(m_2+n_2)$. If $\bar a$ is not too small, $\bar a - 1/(m_1+n_1)$ quickly approaches $\bar a$. Therefore, if $\bar a$ is not too small, both $E(n_{12})$ and $E(n_{22})$ are insensitive to $m_1 + n_1$ (i.e., to the variance of the Beta distribution on "a") provided $m_1 + n_1$ is somewhat larger than $1/\bar a$. Similarly, if $\bar b$ is not too small, both $E(n_{21})$ and $E(n_{11})$ are insensitive to $m_2 + n_2$ (i.e., to the variance of the Beta distribution on "b"), provided $m_2 + n_2$ is somewhat larger than $1/\bar b$. This behavior is illustrated in Figure 4.4, which presents plots of the $E(n_{ij})$'s as functions of $m_1 + n_1$ and $m_2 + n_2$ for the numerical example where $\bar a = 0.1$ and $\bar b = 0.5$. $\bar b$ is not too small, hence $E(n_{21})$ and $E(n_{11})$ are seen to be very insensitive to $m_2 + n_2$ as soon as $m_2 + n_2$ gets slightly away from the value 2. On the other hand, the low $\bar a$ value causes $E(n_{12})$ and $E(n_{22})$ to be fairly sensitive to $m_1 + n_1$ for a large range of $m_1 + n_1$.

Figure 4.4: $E(n_{ij})$'s as Functions of the Beta Parameters.

It  has  been  demonstrated  that  for  a  2-state  Markov  process 
whose  transition  probabilities  are  independently  Beta  distributed  it  is 
a  simple  matter  to  obtain  the  expected  values  of  the  mean  recurrence 
times.  As  the  latter  quantities  are  occasionally  of  importance  in 
physical  situations,  this  is  a  worthwhile  discovery. 
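The sensitivity statements of this section can be illustrated numerically. The sketch below assumes the 2-state matrix with rows (1-a, a) and (b, 1-b), independent Beta densities proportional to x^(m-1) (1-x)^(n-1), and the known-matrix relations 1/a, 1/b, 1 + a/b and 1 + b/a for the mean passage and recurrence times. The function names are illustrative and are not taken from Appendix N.

```python
def beta_mean(m: float, n: float) -> float:
    """Mean of a Beta density proportional to x**(m-1) * (1-x)**(n-1)."""
    return m / (m + n)


def expected_inverse(m: float, n: float) -> float:
    """E(1/x) for the same Beta density; finite only when m > 1."""
    if m <= 1:
        raise ValueError("E(1/x) is infinite unless m > 1")
    return (m + n - 1.0) / (m - 1.0)


def expected_first_passage(m1, n1, m2, n2):
    """Expected mean first passage times, keyed by (i, j).

    Uses E(1/a), E(1/b) and the independence of a and b:
    E(1 + a/b) = 1 + E(a) * E(1/b), and similarly for state 2.
    """
    a_bar, b_bar = beta_mean(m1, n1), beta_mean(m2, n2)
    inv_a, inv_b = expected_inverse(m1, n1), expected_inverse(m2, n2)
    return {
        (1, 2): inv_a,
        (2, 1): inv_b,
        (1, 1): 1.0 + a_bar * inv_b,
        (2, 2): 1.0 + b_bar * inv_a,
    }


if __name__ == "__main__":
    # Mean of "a" fixed at 0.1 while m1 + n1 grows; "b" fixed at Beta(25, 25).
    for total in (20, 50, 200, 1000):
        m1 = 0.1 * total
        result = expected_first_passage(m1, total - m1, 25, 25)
        print(total, round(result[(1, 2)], 3), round(result[(2, 2)], 3))
```

With a mean of 0.1 on "a", the printed values approach their known-matrix counterparts only slowly as $m_1 + n_1$ grows, in line with the remark that a low $\bar a$ keeps these quantities sensitive over a wide range.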

4.3. State Occupancy Times

4.3.1. The Probability Mass Function of the Occupancy Time

Again, consider an N-state Markov process with transition matrix $P = (p_{ij})$ where the $p_{ij}$'s are multidimensional Beta distributed with parameters $m_{ij}$. Under these circumstances we have shown in section 3.2 that the marginal distribution on $p_{ii}$ is given by

$$f_{p_{ii}}(x) = f_\beta\Big(x \,\Big|\, m_{ii},\ \sum_{j \ne i} m_{ij}\Big).$$

Let $u_i$ be the number of the transition on which the system leaves state i for the first time given that it is in state i before the first transition. Then it is well known that

$$p_{u_i \mid p_{ii}}(k \mid x) = x^{k-1}(1-x), \qquad k \ge 1. \tag{4.6}$$

Now we use the same mechanism as earlier. Select an x value from the $p_{ii}$ distribution and determine the corresponding probability mass function on $u_i$. This is repeated a large number of times and we would like to know the resulting marginal probability mass function on $u_i$,

$$p_{u_i}(k) = \int_0^1 p_{u_i \mid p_{ii}}(k \mid x)\, f_{p_{ii}}(x)\, dx.$$


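Section O.1 of Appendix O reveals that

$$p_{u_i}(k) = \begin{cases} (1-\bar p_{ii}) \left[\dfrac{\bar p_{ii}}{1+w_i}\right] \left[\dfrac{\bar p_{ii}+w_i}{1+2w_i}\right] \cdots \left[\dfrac{\bar p_{ii}+(k-2)w_i}{1+(k-1)w_i}\right], & k \ge 2 \\[2ex] 1-\bar p_{ii}, & k = 1 \end{cases} \tag{4.7}$$

where

$$\bar p_{ii} = \frac{m_{ii}}{\sum_{j=1}^{N} m_{ij}} \quad \text{and} \quad w_i = \frac{1}{\sum_{j=1}^{N} m_{ij}}. \tag{4.8}$$

NOTE: For $p_{ii}$ exactly known at $\bar p_{ii}$ the quantity $w_i = 0$ and equation (4.7) reduces to equation (4.6), as it should.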
It is interesting that the probability of exit on the very next transition (k = 1) is independent of the sum of the $m_{ij}$'s and only depends on $\bar p_{ii}$. Furthermore, it is now no longer obvious that $E(u_i)$ is finite (as in the case when $p_{ii}$ is exactly known).

Equation (4.7) was used to develop the curves of Figures 4.5 and 4.6.


4.3.2. The Mean Occupancy Time

When $p_{ii}$ is exactly known the mean occupancy time is given by $1/(1-p_{ii})$. However, when $p_{ii}$ is Beta distributed as assumed here, the probability mass function of the occupancy time is as developed in equation (4.7). Although we cannot obtain the exact value of $E(u_i)$, section O.2 of Appendix O reveals that the mean value of the occupancy time can be bounded as follows:

(4.9)

The upper bound is not very tight but it does serve the important purpose of showing that the mean value of the occupancy time is finite. This is not obvious from just looking at the probability mass function (equation (4.7)). The lower bound of equation (4.9) is obtained by merely truncating the summation

$$\sum_{k=1}^{\infty} k\, p_{u_i}(k)$$

at a finite number (r) of terms.
Figure 4.5: The Probability Mass Function of State Occupancy Time, $p_{u_i}(k)$, for Fixed $\sum_j m_{ij}$.

Figure 4.6: The Probability Mass Function of State Occupancy Time, $p_{u_i}(k)$, for Fixed $\bar p_{ii}$.
Although a closed expression could not be found for the mean occupancy time, recall that we were able to obtain the probability mass function of the occupancy time when the transition probabilities are multidimensional Beta distributed. This appears to be the one important quantity that is analytically calculable in an N state Markov process when the transition probabilities are multidimensional Beta distributed.


Numerical Example

Suppose

$$f_{p_{ii}}(x) = f_\beta(x \mid 5, 45), \quad \text{i.e.,} \quad m_{ii} = 5, \quad \sum_j m_{ij} = 50.$$

Using equation (4.9), the upper bound is given by 50.1.

The lower bound for r = 1 is 0.90
                    r = 2 is 1.08
                    r = 3 is 1.11
                    r = 4 is 1.12

Therefore, $1.12 < E(u_i) < 50.1$, a rather wide bound, but as stated earlier, the upper bound does serve the useful purpose of assuring us that $E(u_i)$ is, indeed, finite. For $p_{ii}$ exactly known at 5/(50) = 0.1, the mean occupancy time would be 1/(1 - 0.1) or 1.11.
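The lower bounds in this example can be reproduced directly from equation (4.7). The following is a minimal sketch, assuming the product form of (4.7) with $\bar p_{ii}$ and $w_i$ as in (4.8); the function names are illustrative.

```python
def occupancy_pmf(k: int, m_ii: float, total_m: float) -> float:
    """Marginal probability that state i is left on transition k,
    following the product form of equation (4.7)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    p_bar = m_ii / total_m   # mean of p_ii, equation (4.8)
    w = 1.0 / total_m        # w_i, equation (4.8)
    prob = 1.0 - p_bar
    # Factors (p_bar + l*w) / (1 + (l+1)*w) for l = 0, ..., k-2.
    for l in range(k - 1):
        prob *= (p_bar + l * w) / (1.0 + (l + 1) * w)
    return prob


def truncated_mean(r: int, m_ii: float, total_m: float) -> float:
    """Lower bound on the mean occupancy time: the first r terms of
    the sum of k * p(k)."""
    return sum(k * occupancy_pmf(k, m_ii, total_m) for k in range(1, r + 1))


if __name__ == "__main__":
    for r in (1, 2, 3, 4):
        print(r, truncated_mean(r, 5, 50))
```

Because every term of the sum is positive, each additional term can only raise the bound, so the truncated sums form an increasing sequence of lower bounds.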
4.4. Transient Behavior

Again, let the mechanism be as follows. Values of the transition probabilities are drawn from their distributions and the multi-step transition probabilities are evaluated conditional upon these exactly known values of the transition probabilities. If this process is repeated a large number of times, what can be said about the unconditional multi-step transition probabilities?

Define

$\phi_{ij}(n \mid P)$ = pr{state at time n is j | state at time 0 is i and the matrix P, with exactly known transition probabilities, is being used}

and

$\bar\phi_{ij}(n)$ = pr{state at time n is j | state at time 0 is i}.

As an example consider the 2-state process with

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

where "a" and "b" are distributed according to $f_a(x)$ and $f_b(y)$.

$$\bar\phi_{12}(1) = \int_0^1 x\, f_a(x)\, dx = \bar a,$$

the mean of the distribution on "a". Similarly,

$$\bar\phi_{11}(1) = 1 - \bar a, \quad \bar\phi_{21}(1) = \bar b, \quad \text{and} \quad \bar\phi_{22}(1) = 1 - \bar b.$$
However,

$$\bar\phi_{12}(2) = \int_0^1\!\!\int_0^1 [(1-x)x + x(1-y)]\, f_a(x)\, f_b(y)\, dx\, dy$$

$$= \int_0^1\!\!\int_0^1 [2x - x^2 - xy]\, f_a(x)\, f_b(y)\, dx\, dy$$

$$= 2\bar a - \overline{a^2} - \bar a \bar b \ne 2\bar a - (\bar a)^2 - \bar a \bar b.$$

The second moment of the a distribution has now become important. More generally,

$$\bar\phi_{ij}(n) = \int_0^1\!\!\int_0^1 \phi_{ij}(n \mid x, y)\, f_a(x)\, f_b(y)\, dx\, dy$$

i.e.,

$$\bar\phi_{ij}(n) = [\phi_{ij}(n \mid a, b)]^{\#} \tag{4.10}$$

where # denotes that $a^j$ has been replaced by $\overline{a^j}$ and $b^k$ by $\overline{b^k}$. This result is useful for the 2-state case where $\phi_{ij}(n \mid (p_{ij}))$ has been obtained in closed form. For more than 2 states there is no closed form expression for $\phi_{ij}(n \mid (p_{ij}))$ in terms of power series in the transition probabilities. (The 3-state situation is illustrated in Appendix F.) However, for 3 or more states for a specific n we could find $\phi_{ij}(n)$ as a power series in the transition probabilities by obtaining the i-j element of $P^n$ through actually raising P to the n-th power (where the transition probabilities would be left in symbolic form). Power series are required to allow us to replace $p_{ij}^k$ by $\overline{p_{ij}^k}$. For the 2-state situation, equation (4.10) was used to develop $\bar\phi_{ij}(n)$ for Beta distributions over "a" and "b". In the 2-state case we have


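$$\Phi(n \mid a, b) = (\phi_{ij}(n \mid a, b)) = \begin{bmatrix} \dfrac{b}{a+b} & \dfrac{a}{a+b} \\[2ex] \dfrac{b}{a+b} & \dfrac{a}{a+b} \end{bmatrix} + (1-a-b)^n \begin{bmatrix} \dfrac{a}{a+b} & \dfrac{-a}{a+b} \\[2ex] \dfrac{-b}{a+b} & \dfrac{b}{a+b} \end{bmatrix}$$

This does not appear to be expressible in terms of $a^j$ and $b^k$. However, for any specific n this can be achieved by expanding the above expression. For example, suppose n = 3, then

$$\phi_{12}(3 \mid a, b) = \frac{a}{a+b}\,[1 - (1-a-b)^3]$$

$$= \frac{a}{a+b}\,[1 - 1 + 3(a+b) - 3(a+b)^2 + (a+b)^3]$$

$$= a[3 - 3(a+b) + (a+b)^2]$$

$$= a^3 + a^2(-3+2b) + a(3-3b+b^2)$$

$$\therefore\ \bar\phi_{12}(3) = \overline{a^3} + \overline{a^2}(-3+2\bar b) + \bar a(3 - 3\bar b + \overline{b^2}).$$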


Expressions of this nature were developed for n = 1, 2, ..., 10 and then, through the use of a computer program, $\bar\phi_{ij}(n)$, n = 2, ..., 10 were evaluated for the following combinations

$\bar a$ = 0.1, 0.5, 0.9
$\bar b$ = 0.1, 0.5, 0.9
$m_1 + n_1$ = 10, 50, 200, 500, $\infty$
$m_2 + n_2$ = 10, 50, 200, 500, $\infty$

where

$$f_a(x) = f_\beta(x \mid m_1, n_1)$$

and

$$f_b(y) = f_\beta(y \mid m_2, n_2).$$

17 Ibid.
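The replacement of powers by moments in equation (4.10) lends itself to a short program. The sketch below is an independent illustration, not the computer program used in the study. It assumes the matrix with rows (1-a, a) and (b, 1-b) and Beta densities proportional to x^(m-1) (1-x)^(n-1), and it represents each polynomial in a and b as a dictionary from exponent pairs to coefficients. All names are hypothetical.

```python
from collections import defaultdict


def poly_mul(p, q):
    """Multiply two polynomials stored as {(power of a, power of b): coeff}."""
    out = defaultdict(float)
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            out[(i1 + i2, j1 + j2)] += c1 * c2
    return dict(out)


def poly_add(p, q):
    """Add two polynomials in the same representation."""
    out = defaultdict(float, p)
    for key, c in q.items():
        out[key] += c
    return dict(out)


def mat_mul(A, B):
    """Product of two 2x2 matrices whose entries are polynomials."""
    return [[poly_add(poly_mul(A[i][0], B[0][j]), poly_mul(A[i][1], B[1][j]))
             for j in range(2)] for i in range(2)]


def beta_moment(k, m, n):
    """E(x**k) for a Beta density proportional to x**(m-1) * (1-x)**(n-1)."""
    value = 1.0
    for l in range(k):
        value *= (m + l) / (m + n + l)
    return value


def phi_bar(n_steps, m1, n1, m2, n2):
    """Unconditional n-step transition probabilities for n_steps >= 1."""
    one, a, b = {(0, 0): 1.0}, {(1, 0): 1.0}, {(0, 1): 1.0}
    neg = lambda p: {key: -c for key, c in p.items()}
    P = [[poly_add(one, neg(a)), a], [b, poly_add(one, neg(b))]]
    power = P
    for _ in range(n_steps - 1):
        power = mat_mul(power, P)  # symbolic P**n
    # The "#" operation: a**i -> E(a**i), b**j -> E(b**j) (a, b independent).
    return [[sum(c * beta_moment(i, m1, n1) * beta_moment(j, m2, n2)
                 for (i, j), c in power[r][s].items())
             for s in range(2)] for r in range(2)]


if __name__ == "__main__":
    for n in range(1, 11):
        print(n, phi_bar(n, 5, 5, 5, 5)[0][1])
```

Raising the symbolic matrix to the n-th power avoids the division by a + b in the closed form, so every entry is already a power series in a and b and the moment substitution applies term by term.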

Typical curves are presented in Figure 4.7 for the case $\bar a = 0.5$, $\bar b = 0.5$. The entire results (which were too detailed to be included even in an appendix) can be summarized by a few remarks. For the 2-state process with "a" and "b" exactly known we know that the steady state probabilities will be approached in an oscillatory manner if and only if a + b > 1 (see Howard¹⁸). Empirically, at least, the behavior when both transition probabilities are independently Beta distributed can best be described by the flow chart of Figure 4.8.

18 Ibid.

Figure 4.8. Oscillation in the transient behavior of 2-state processes when both transition probabilities are independently Beta distributed.


4.5. A Simple Trapping States Situation
4.5.1. 3-State Example

Consider  the  extremely  simple  Markov  process  whose  transition 
matrix  and  flow  graph  are  as  follows: 
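where $q_1$ and $q_2$ are multidimensional Beta distributed, i.e.,

$$f_{q_1,q_2}(x, y) = f_\beta(x, y \mid m_2, m_3, m_1) = \frac{\Gamma(m_1+m_2+m_3)}{\Gamma(m_1)\,\Gamma(m_2)\,\Gamma(m_3)}\,(1-x-y)^{m_1-1}\, x^{m_2-1}\, y^{m_3-1},$$

$$0 \le x, \quad 0 \le y, \quad x + y \le 1.$$

Let $g_j$ be the probability that the process traps in state j. Then

$$g_2 = \frac{q_1}{q_1+q_2} \quad \text{and} \quad g_3 = \frac{q_2}{q_1+q_2}.$$

Appendix P reveals the following simple results:

$$E(g_2) = \frac{m_2}{m_2+m_3} \quad \text{and} \quad E(g_3) = \frac{m_3}{m_2+m_3}.$$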

Bayes Modification

Since the process traps in state 2 or 3 we must think of a large number of units starting in state 1 for a Bayes approach to have any meaning.

A priori distribution

$$f_{q_1,q_2}(x, y) = f_\beta(x, y \mid m_2, m_3, m_1)$$

Suppose we observe

$n_1$ transitions which return to state 1
$n_2$ transitions which go to state 2
$n_3$ transitions which go to state 3

Then the a posteriori distribution is

$$f_{q_1,q_2}(x, y) = f_\beta(x, y \mid m_2+n_2,\ m_3+n_3,\ m_1+n_1)$$

and the a posteriori expected values of the probabilities of trapping in states 2 and 3 are

$$E(g_2) = \frac{m_2+n_2}{m_2+m_3+n_2+n_3}$$

and

$$E(g_3) = \frac{m_3+n_3}{m_2+m_3+n_2+n_3}.$$

Note that 1-1 transitions have no effect on $E(g_2)$ and $E(g_3)$. However, they would change our feelings as to the number of transitions for a trap to occur.


4.5.2.  Generalization  to  N  States 

The generalization to N states is accomplished without difficulty.

$$P = \begin{bmatrix} 1 - \sum_{i=1}^{N-1} q_i & q_1 & \cdots & q_{N-1} \\ 0 & 1 & & 0 \\ \vdots & & \ddots & \\ 0 & 0 & & 1 \end{bmatrix}$$

The a priori density function is

$$f_{q_1,\ldots,q_{N-1}}(x_1, \ldots, x_{N-1}) = f_\beta(x_1, \ldots, x_{N-1} \mid m_2, m_3, \ldots, m_N, m_1)$$

$$E(g_j) = \frac{m_j}{\sum_{i=2}^{N} m_i}, \qquad j = 2, 3, \ldots, N$$

Suppose we observe $n_i$ transitions from state 1 to i, i = 2, ..., N. Then the a posteriori expectation of trapping in j is given by

$$\frac{m_j+n_j}{\sum_{i=2}^{N} (m_i+n_i)}$$
It  is  realized  that  the  trapping  state  example  dealt  with  here  is 
practically trivial. However, it does give some insight into the problems encountered when the transition probabilities are not known
exactly.  Furthermore,  for  more  complicated  trapping  situations  we 
immediately  become  involved  in  the  same  sort  of  predicament  as  was 
encountered  in  the  study  of  the  steady  state  probabilities  when  the 
process  contained  more  than  2  states. 
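The prior and posterior trapping expectations of this section reduce to a few lines of arithmetic. A minimal sketch follows; the function name and the dictionary layout are illustrative choices, not part of the original development.

```python
def trapping_expectations(m, n=None):
    """Expected trapping probabilities E(g_j) for trapping states j = 2..N.

    m: dict mapping each trapping state j to its prior parameter m_j.
    n: optional dict of observed transition counts from state 1 to j.
    The parameter and count for state 1 itself are deliberately absent,
    because 1-1 transitions do not affect the E(g_j)'s.
    """
    n = n or {}
    posterior = {j: m[j] + n.get(j, 0) for j in m}
    total = sum(posterior.values())
    return {j: value / total for j, value in posterior.items()}


if __name__ == "__main__":
    prior = {2: 2, 3: 3}
    print(trapping_expectations(prior))          # a priori
    print(trapping_expectations(prior, {2: 1}))  # after one 1-2 transition
```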




CHAPTER  5 


STATISTICAL  DECISIONS  IN  MARKOV  PROCESSES 
WHEN  THE  TRANSITION  PROBABILITIES  ARE 
NOT  KNOWN  EXACTLY 


Before  starting  this  chapter  it  is  suggested  that  the  reader 
scan  section  5.6  to  obtain  an  overall  idea  of  the  goals  of  the  chapter 
and  how  they  tie  in  with  those  of  the  previous  two  chapters. 

5.1.  The  Importance  of  the  Expected  Values  of  the  Steady  State 
Probabilities 


Consider an N-state Markov process with a given transition matrix P. Let the corresponding steady state probabilities be represented by $\pi = [\pi_1, \pi_2, \ldots, \pi_N]$. Suppose that there is a reward vector $r = [r_1, r_2, \ldots, r_N]$, where $r_j$ = reward for being in state j for one time period. Then, in the steady state the expected reward per time period is

$$R = \sum_{j=1}^{N} \pi_j r_j$$

Now, if the $p_{ij}$'s are random variables, the $\pi_j$'s and, therefore, also R become random variables. But

$$E(R) = \sum_{j=1}^{N} r_j\, E(\pi_j) \tag{5.1}$$

The quantity E(R) is central to many decision processes. Hence, it is essential that we be able to evaluate it for convenient distributions over the $p_{ij}$'s. However, equation (5.1) reveals that we must determine the $E(\pi_j)$'s to obtain E(R). This was the motivation for such careful study of the $E(\pi_j)$'s for multidimensional Beta distributions over the transition probabilities in section 4.1. The reasoning behind using multidimensional Beta distributions has been discussed at length in section 3.5.


5.2. A 2-State Problem

5.2.1. Description of the Problem

Consider a 2-state Markov process with transition matrix

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

where "a" is exactly known but

$$f_b(x) = f_\beta(x \mid m, n) = \frac{\Gamma(m+n)}{\Gamma(m)\,\Gamma(n)}\, x^{m-1}(1-x)^{n-1}, \qquad 0 < x < 1.$$

Let

c be the fixed cost per time period for using the process,
$r_j$ be the reward per time period for being in state j (j = 1, 2)

and

d be the cost of observing a transition from state 2 regardless of the state to which it goes.

Suppose that we have the option of buying the right to observe k transitions from state 2 (the outcomes of which would modify, in a Bayes sense, the probability distribution over "b"). For the problem to have economic meaning, let us consider that one of the following two conditions exists.

i) The process will only be run for s periods starting in the steady state,

or

ii) The process will run indefinitely but there is a known discount factor.

The second condition would be more likely to exist in the real world than the first.


5.2.2. The Situation Where the Observations Do Not Affect the Decision Procedure

Intuitively, it certainly would not be worthwhile to pay for observations if they cannot affect the decision procedure. However, in the research this fact was overlooked at first and some rather interesting, non-trivial results were developed. They are, therefore, included in this section and Appendix Q.

Let $E_r$ be the event that in k transitions from state 2, r of them go to state 1. Then, according to Raiffa and Schlaifer¹⁹

$$\mathrm{pr}(E_r \mid m, n, k) = p_{\beta b}(r \mid m, n, k) = \frac{(m+r-1)!\,(n+k-r-1)!\,k!\,(m+n-1)!}{r!\,(m-1)!\,(k-r)!\,(n-1)!\,(m+n+k-1)!}, \qquad r = 0, 1, \ldots, k \tag{5.2}$$

This is the Beta-binomial distribution. Also, we know, from section 3.4 that

$$f_b(x \mid E_r) = f_\beta(x \mid m+r,\ n+k-r),$$

a new member of the same Beta family.

19 Op. cit., p. 265.

Assume that the process will be used for s periods in the steady state. If the Beta parameters are m and n and the other transition probability is "a", then the expected net revenue is given by

$$E(N.R. \mid m, n, a) = s[r_1\, E(\pi_1 \mid m, n, a) + r_2\, E(\pi_2 \mid m, n, a) - c]$$

$$= s[r_1 - c + (r_2 - r_1)\, E(\pi_2 \mid m, n, a)].$$

From the results of section 4.1.3 and Appendix G

$$E(N.R. \mid m, n, a) = s\left[r_1 - c + (r_2 - r_1)\,\frac{a}{a+1}\, F\!\left(1,\ n;\ m+n;\ \frac{1}{a+1}\right)\right] \tag{5.3}$$

Also define $E(N.R. \mid m, n, a; k)$ to be the expected net revenue (prior to the observations and excluding the cost of observation) if m and n are the Beta parameters, "a" is the other transition probability and k observations are to be taken. Then

$$E(N.R. \mid m, n, a; k) = \sum_{r=0}^{k} \mathrm{pr}(E_r \mid m, n, k)\, E(N.R. \mid m+r,\ n+k-r,\ a)$$

i.e.,

$$E(N.R. \mid m, n, a; k) = \sum_{r=0}^{k} p_{\beta b}(r \mid m, n, k)\, E(N.R. \mid m+r,\ n+k-r,\ a) \tag{5.4}$$


In section Q.1 of Appendix Q it is shown by algebraic manipulation involving equations (5.2), (5.3) and (5.4) that

$$E(N.R. \mid m, n, a; k) = E(N.R. \mid m, n, a),$$

the result that was intuitively anticipated at the beginning of this section.

An alternate proof is presented in section Q.2 of Appendix Q through the use of a theorem that is of interest for decision theory in general.

Figure 5.1. A 2-state decision problem where the observations do not affect the decision. (Legend: × decision node; ○ chance node; a marked branch indicates the decision to take, as proved in Appendix Q.)
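The equality of the pre-posterior and prior expected net revenues can be checked numerically. The sketch below assumes that $\pi_2 = a/(a+b)$ with b Beta distributed, evaluates $E(\pi_2)$ through the series for the hypergeometric function F(1, n; m+n; 1/(a+1)), and weights the posterior values by the Beta-binomial probabilities of equation (5.2). The function names and the series truncation rule are illustrative assumptions.

```python
from math import exp, lgamma


def beta_binomial(r, m, n, k):
    """Equation (5.2), with factorials written as log-gamma functions."""
    log_p = (lgamma(m + r) + lgamma(n + k - r) + lgamma(k + 1) + lgamma(m + n)
             - lgamma(r + 1) - lgamma(m) - lgamma(k - r + 1) - lgamma(n)
             - lgamma(m + n + k))
    return exp(log_p)


def hyp_f(n, c, z, tol=1e-16):
    """Series for F(1, n; c; z) with 0 <= z < 1: the sum over j of
    (n)_j / (c)_j * z**j, where (x)_j is the rising factorial."""
    term, total, j = 1.0, 1.0, 0
    while term > tol:
        term *= (n + j) / (c + j) * z
        total += term
        j += 1
    return total


def expected_pi2(m, n, a):
    """E[a / (a + b)] for b with density proportional to x**(m-1) * (1-x)**(n-1)."""
    z = 1.0 / (a + 1.0)
    return a * z * hyp_f(n, m + n, z)


def net_revenue(m, n, a, s, r1, r2, c):
    """Expected net revenue over s steady-state periods."""
    return s * (r1 - c + (r2 - r1) * expected_pi2(m, n, a))


def preposterior_revenue(m, n, a, k, s, r1, r2, c):
    """Equation (5.4): average of posterior revenues over the k observations."""
    return sum(beta_binomial(r, m, n, k)
               * net_revenue(m + r, n + k - r, a, s, r1, r2, c)
               for r in range(k + 1))


if __name__ == "__main__":
    args = dict(a=0.9, s=5, r1=60, r2=20, c=40)
    print(net_revenue(9, 1, **args))
    print(preposterior_revenue(9, 1, k=3, **args))
```

The two printed numbers agree to rounding error for any k, which is the numerical counterpart of the statement that observations are worthless when they cannot change the decision.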




5.2.3. The Situation Where the Observations Do Affect the Decision Procedure

Now let us look at the more realistic situation where we shall decide to use the process only if E(N.R.) is positive, otherwise we do not use it and the net revenue is zero. (The extension to the situation where E(N.R.) must be greater than some fixed constant is trivial.) Stated in another way, we shall act so as to maximize the expected profit. Now experimentation may be worthwhile. The case where the process can be used for s periods (in the steady state) will be considered. The rest of the required information is as outlined in section 5.2.1.

The new decision problem is shown in Figure 5.2.

Figure 5.2. A 2-state decision problem where the observations do affect the decision.
It is clear that now

$$E(N.R. \mid m, n, a; k) = \sum_{r=0}^{k} p_{\beta b}(r \mid m, n, k)\, \max[0,\ E(N.R. \mid m+r,\ n+k-r,\ a)]$$

and we select

"Buy observations" if $[E(N.R. \mid m, n, a; k) - kd]$ is greater than both 0 and $E(N.R. \mid m, n, a)$,

"Use process without observations" if $E(N.R. \mid m, n, a)$ is greater than both 0 and $[E(N.R. \mid m, n, a; k) - kd]$,

"Stop" if both $[E(N.R. \mid m, n, a; k) - kd]$ and $E(N.R. \mid m, n, a)$ are less than 0.


Numerical  Example 


Pierre  and  Louie  are  contemplating  a  5  day  fishing  trip, 
several months hence, in the wilds of northern Quebec. They, their
gear and canoe will be flown in and out of the wilderness. However,
realizing  that  there  is  a  considerable  expense  involved  they  have 
decided  to  do  some  rapid  calculations  before  making  the  trip.  In  fact, 
they  will  only  make  the  trip  if  their  expected  net  revenue  is  positive. 

Including  the  transportation  costs  the  expenses  per  day  would 
be  $40.  They  are  willing  to  put  a  dollar  value  on  the  pleasures 
derived  from  the  fishing,  etc.  However,  the  value  is  a  function  of  the 
weather.  On  a  reasonably  sunny  day  the  value  is  $60,  on  a  cloudy  or 
rainy  day,  $20. 

They are convinced that the day-to-day weather can be represented as a Markov process with transition matrix

$$P = \begin{array}{c|cc} & S & R \\ \hline S & 1-a & a \\ R & b & 1-b \end{array}$$

Because the fishing area is so far in the wilderness, data on its weather conditions is extremely difficult to obtain. Pierre wrote to the Dominion Weather Bureau asking for the number of occurrences on record of the transition couplets (SS), (SR), (RS), (RR). (SS stands for sunny day followed by sunny day.) The government official sent back information on only (SS) and (SR). Furthermore, he has stated that any additional weather information of this nature will be charged at a fixed cost ($0.20) per transition couplet.

From the (SS) and (SR) information together with their prior knowledge, Pierre and Louie are satisfied with saying that "a" is exactly known at 0.9; i.e., a rainy day is very likely after a sunny one. After talking with friends, reading geography books, etc., they are willing to assume that "b" is Beta distributed with parameters m = 9, n = 1; i.e., again a mean of 0.9.

Pierre  is  not  willing  to  spend  any  more  money  on  weather 
information.  Louie  wants  to  buy  one  piece  of  information,  namely  the 
weather  on  a  day  after  one  specific  rainy  day  (i.e.  ,  the  outcome  of  a 
transition  from  the  R  state).  Which  man's  decision  is  preferable? 

Using the symbols of section 5.2.1,

a = 0.9, m = 9, n = 1
k = 1 (number of possible transitions observable)
d = $0.20 (cost of each transition)
c = $40 (cost per period for utilizing process)
$r_1$ = $60 (reward per period for being in state 1 or S)
$r_2$ = $20 (reward per period for being in state 2 or R)

Using equation (5.2),

$p_{\beta b}(0 \mid 9, 1, 1) = 0.1$
$p_{\beta b}(1 \mid 9, 1, 1) = 0.9$

Equation (5.3) gives

$$E(N.R. \mid m, n, a) = s[r_1 - c + (r_2 - r_1)\, E(\pi_2 \mid m, n, a)]$$

Using this expression together with the method of evaluating $E(\pi_2 \mid m, n, a)$ by means of the hypergeometric function (outlined in section 4.1.3) we obtain

$E(N.R. \mid 9, 1, 0.9) = -\$0.2740$, the expected net revenue if they go fishing without any more information.

$E(N.R. \mid 9, 2, 0.9) = -\$5.2320$
$E(N.R. \mid 10, 1, 0.9) = \$0.2760$

The equivalent structure to Figure 5.2 is presented in the following diagram which gives the solution to the problem.

(Diagram annotation: $0.2484 is the expected net revenue after they have bought the one piece of information.)

In words, they should buy the one observation with an expected net revenue of $0.0484. If the transition is from state 2 to state 2 (i.e., RR), they should not go fishing. However, if it is from state 2 to state 1 (i.e., RS), they should go fishing.
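The fishing decision can be reworked as a small program. This is a sketch under stated assumptions rather than the original calculation: $E(\pi_2)$ is obtained here by Simpson integration of a/(a+x) against the Beta density instead of through the hypergeometric function, the Beta parameters are assumed to be at least 1, and s = 5 is taken from the 5 day length of the trip. Because of the different numerical method, the results agree with the figures quoted above only to within a few tenths of a cent. All function names are hypothetical.

```python
from math import exp, lgamma


def beta_pdf(x, m, n):
    """Beta density proportional to x**(m-1) * (1-x)**(n-1); assumes m, n >= 1."""
    log_norm = lgamma(m + n) - lgamma(m) - lgamma(n)
    return exp(log_norm) * x ** (m - 1) * (1.0 - x) ** (n - 1)


def expected_pi2(m, n, a, intervals=2000):
    """E[a / (a + b)] for b ~ Beta(m, n), by composite Simpson integration."""
    total = 0.0
    for i in range(intervals + 1):
        x = i / intervals
        weight = 1 if i in (0, intervals) else (4 if i % 2 else 2)
        total += weight * a / (a + x) * beta_pdf(x, m, n)
    return total / (3.0 * intervals)


def beta_binomial(r, m, n, k):
    """Probability that r of k observed transitions from state 2 go to state 1."""
    return exp(lgamma(m + r) + lgamma(n + k - r) + lgamma(k + 1) + lgamma(m + n)
               - lgamma(r + 1) - lgamma(m) - lgamma(k - r + 1) - lgamma(n)
               - lgamma(m + n + k))


def net_revenue(m, n, a, s, r1, r2, c):
    return s * (r1 - c + (r2 - r1) * expected_pi2(m, n, a))


def decide(m, n, a, k, d, s, r1, r2, c):
    """Return the best of the three alternatives and the value of each."""
    use_now = net_revenue(m, n, a, s, r1, r2, c)
    # After observing, the process is used only if its revenue is positive.
    buy = sum(beta_binomial(r, m, n, k)
              * max(0.0, net_revenue(m + r, n + k - r, a, s, r1, r2, c))
              for r in range(k + 1)) - k * d
    options = {"Stop": 0.0,
               "Use process without observations": use_now,
               "Buy observations": buy}
    return max(options, key=options.get), options


if __name__ == "__main__":
    best, values = decide(m=9, n=1, a=0.9, k=1, d=0.20, s=5, r1=60, r2=20, c=40)
    print(best)
    for name, value in values.items():
        print(name, round(value, 4))
```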

Indefinitely  Long  Use  of  the  Process 

Suppose that instead of being able to use the process for s periods in the steady state we can use it indefinitely long starting x periods from now in the steady state. (This is probably more meaningful in most practical situations.) The discount factor, $\alpha$, is assumed known. Then, the analysis is identical to the above except we now use expected discounted present values instead of the expected net revenues. Utilizing a process with $E(R \mid m, n, a)$ would give a present value of rewards

$$p.v.(m, n, a) = E(R \mid m, n, a)\,[\alpha^x + \alpha^{x+1} + \ldots]$$

$$= \frac{\alpha^x}{1 - \alpha}\, E(R \mid m, n, a).$$
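The closed form of the discounted sum is easy to check against a long partial sum. A minimal sketch, in which the function name and argument names are illustrative:

```python
def present_value(expected_reward: float, alpha: float, x: int) -> float:
    """Present value of a constant expected reward per period received
    from period x onward, with discount factor alpha (0 <= alpha < 1)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("the discount factor must satisfy 0 <= alpha < 1")
    return alpha**x / (1.0 - alpha) * expected_reward


if __name__ == "__main__":
    closed_form = present_value(2.0, 0.9, 3)
    partial_sum = sum(2.0 * 0.9**t for t in range(3, 2000))
    print(closed_form, partial_sum)
```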


5.3.  The  Corresponding  N-State  Problem 


It was demonstrated in section 5.1 that the determination of the expected values of the steady state probabilities is essential in obtaining the expected reward per period in the steady state, a quantity very useful for statistical decision purposes. For 2-state Markov processes we were able to show analytically (in sections 4.1.3 and 4.1.4) that for Beta distributed transition probabilities the expected values of the steady state probabilities are very insensitive to the variances of the Beta distributions (for fixed mean values). For 3 or more states with multidimensional Beta distributed transition probabilities we have arrived at the same conclusion by means of empirical results obtained through the use of simulation techniques (in sections 4.1.6 and 4.1.7). Consequently, for given multidimensional Beta distributions, we may quickly obtain a good approximation to the $E(\pi_j)$'s in the following manner: Assume that the transition probabilities are exactly known at the mean values of the multidimensional Beta distributions, then use these exact values to determine the corresponding steady state probabilities, $(\pi_j)_{ex}$'s, by standard techniques. Finally, use the $(\pi_j)_{ex}$'s as an approximation to the $E(\pi_j)$'s.

If the particular statistical problem requires extremely accurate values of the $E(\pi_j)$'s, it is always possible to obtain these by simulation techniques (as outlined in sections 4.1.6 and 4.1.7) if the $(\pi_j)_{ex}$'s are not a close enough approximation. However, being able to avoid simulation in a large scale problem would save considerable computation time.
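The approximation procedure just described can be sketched in a few lines. The example assumes a process with a single recurrent class and uses simple iteration of a "lazy" version of the mean matrix as the steady state solver. That solver is one convenient choice among the standard techniques, not one prescribed here, and the function names are illustrative.

```python
def mean_matrix(M):
    """Transition matrix with each p_ij fixed at its mean m_ij / sum_j m_ij."""
    return [[m_ij / sum(row) for m_ij in row] for row in M]


def steady_state(P, iterations=10_000):
    """Steady state probabilities by repeated multiplication.

    Iterates with (P + I) / 2, which has the same steady state vector as P
    but cannot oscillate. Assumes a single recurrent class.
    """
    size = len(P)
    pi = [1.0 / size] * size
    for _ in range(iterations):
        pi = [0.5 * pi[j] + 0.5 * sum(pi[i] * P[i][j] for i in range(size))
              for j in range(size)]
    return pi


def approximate_expected_reward(M, rewards):
    """Approximate E(R) by the reward evaluated at the mean transition matrix."""
    pi_ex = steady_state(mean_matrix(M))
    return sum(r * p for r, p in zip(rewards, pi_ex))


if __name__ == "__main__":
    M = [[1, 9], [10, 1]]  # illustrative multidimensional Beta parameters
    print(steady_state(mean_matrix(M)))
    print(approximate_expected_reward(M, [60, 20]))
```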


It is worthwhile noting that, as

$$R(P) = \sum_{j=1}^{N} r_j\, \pi_j(P)$$

is a pure (as defined in Appendix Q) function of the $p_{ij}$'s, the theorem of section Q.2 of Appendix Q immediately tells us to not pay for observations if we cannot make the decision a function of the observed transitions.


5.3.1.  Description  of  the  Problem 


Consider an N-state Markov process whose transition probabilities are multidimensional Beta distributed with parameters $M = (m_{ij})$. Let

c be the fixed cost per time period for using the process,
$r_j$ be the reward per time period for being in state j (j = 1, ..., N)

and

d be the cost of observing each transition.

Assume that the following three alternatives exist:

i) Stop entirely with zero net revenue.

ii) Do not observe the process anymore and decide to use it for s periods in the steady state.

iii) Buy observations of the next k transitions, then either stop or use the process (with no further experimentation) after the k transitions.


5.3.2. Analysis

Let B be the event that the actual transition sequence is known
and the transition frequency count is \(F = (f_{ij})\), where \(f_{ij}\) = number of
transitions from state i to state j. For large k the number of such
events possible becomes very large. The probability of event B can be
determined by considering each row of F separately. Using the final
result of Appendix D,

\[ \mathrm{pr}(f_{i1}, \ldots, f_{iN} \text{ and known order}) = \frac{\beta(m_{i1}+f_{i1}, \ldots, m_{iN}+f_{iN})}{\beta(m_{i1}, m_{i2}, \ldots, m_{iN})} \]

i.e.,

\[ \mathrm{pr}(f_{i1}, \ldots, f_{iN} \text{ and known order}) = \frac{\Gamma\left(\sum_{j=1}^{N} m_{ij}\right) \prod_{j=1}^{N} \Gamma(m_{ij}+f_{ij})}{\Gamma\left(\sum_{j=1}^{N} (m_{ij}+f_{ij})\right) \prod_{j=1}^{N} \Gamma(m_{ij})} \tag{5.5} \]

Then,

\[ \mathrm{pr}[F \text{ and known order}] = \prod_{i=1}^{N} \mathrm{pr}(f_{i1}, \ldots, f_{iN} \text{ and known order}) \tag{5.6} \]


Knowing the event B, we can quickly determine the posterior
density functions of the \(p_{ij}\)'s through the use of Bayes' rule. As outlined
in section 3.4, the \(p_{ij}\)'s will still be multidimensional Beta
distributed except that the parameters will now be \(M + F = (m_{ij} + f_{ij})\).
Thus we can easily calculate the expected reward for using the process
after the event B has been observed.




The expected net revenue without experimentation (i.e.,
observations) is

\[ E(N.R. \mid (m_{ij})) = \max_{S,U} \begin{cases} \text{Stop:} & 0 \\ \text{Use:} & s[E(R \mid (m_{ij})) - c] \end{cases} \tag{5.7} \]

where, as earlier,

\[ E(R \mid (m_{ij})) = \sum_{t=1}^{N} r_t \, E(\pi_t \mid (m_{ij})) \]

is the expected reward per time period in the steady state given that the
transition probabilities are multidimensional Beta distributed with
parameters \(M = (m_{ij})\). (The actual calculation of \(E(R \mid (m_{ij}))\) is
described in the numerical example presented in the next section.)

The expected net revenue with experimentation is

\[ E(N.R. \mid (m_{ij}); k) = \sum_{\text{all } B} \mathrm{pr}(B) \, \max_{S,U}\left[0, \; s[E(R \mid (m_{ij}), B) - c]\right] - kd \tag{5.8} \]

If

\[ E(N.R. \mid (m_{ij}); k) > E(N.R. \mid (m_{ij})), \]

we elect to observe the transitions; if not, we do not buy the
observations.

To illustrate the difficulty in obtaining all possible B's and in
calculating their probabilities, the following 3-state example is
presented.




5.3.3. Numerical Example


A shady carnival man, Gary, always looking for an "honest"
dollar, has asked his friend Mark if he would be interested in the
following game: Gary has three dice: 1 red, 1 white and 1 blue. Each
has two sides marked 1, two sides marked 2 and two sides marked 3.
Anytime a 1 is rolled the red die is used for the next toss. Similarly,
2 and 3 force the use of the white or blue die, respectively.

Gary has made the game appear attractive by means of the
following cost framework. He will charge Mark $5.00 per roll, but will
pay him $4.00 for each 1 rolled, $5.00 for each 2 and $6.00 for
each 3. A naive bystander quickly figures that (4.00 + 5.00 + 6.00)/3 =
5.00 and hence the game is fair in the long run.

Gary  realizes  that  the  unknown  bias  of  each  die  is  quite  impor¬ 
tant  for  later  conquests  and  he  doesn't  want  Mark  to  learn  too  much 
from  playing.  Furthermore,  he  will  not  give  Mark  the  opportunity  of 
starting  with  a  particular  die.  Therefore,  he  will  only  let  Mark  play 
for  5  rolls  and  the  first  roll  will  only  be  after  Gary  himself  has  been 
rolling  the  dice  (according  to  the  prescribed  mechanism)  for  quite  some 
time  (there  is  a  third  party  present  who  will  keep  things  honest). 

To  make  things  even  more  confusing,  Gary  has  offered  Mark 
the  option  of  observing  two  rolls  beginning  with  the  red  die  at  a  cost 
of  d  per  toss.  Should  Mark  listen  to  the  bystander  and  play  the 
game?  Should  he  take  advantage  of  the  option? 

Unknown  to  Gary,  Mark  has  seen  a  few  rolls  of  the  dice.  Also, 
he  is  allowed  to  carefully  scrutinize  them.  From  these  two  sources  of 
information  he  feels  that  he  cannot  state  exact  values  for  the  transition 
probabilities,  but  he  is  willing  to  assume  that  they  are  multidimensional 
Beta  distributed  with  the  following  parameters: 
\[ M = \begin{pmatrix} 5 & 10 & 5 \\ 1 & 1 & 3 \\ 2 & 4 & 6 \end{pmatrix} \]

where the rows and the columns are both ordered R, W, B.

Using the notation of section 5.3.1,

c = $5.00,  r = [$4.00  $5.00  $6.00].

Without using the option, Mark's expected net revenue (using
equation (5.7)) is

\[ E(N.R. \mid (m_{ij})) = \max_{S,U} \begin{cases} 0 \\ s[E(R \mid (m_{ij})) - c] \end{cases} \tag{5.9} \]

Now

\[ s[E(R \mid (m_{ij})) - c] = s[r_1 E(\pi_1 \mid (m_{ij})) + r_2 E(\pi_2 \mid (m_{ij})) + r_3 E(\pi_3 \mid (m_{ij})) - c] \tag{5.10} \]

At this stage we use the approximation (discussed earlier) that

\[ E(\pi_j \mid (m_{ij})) \approx (\pi_j)_{ex}, \]

the exact steady state probability calculated from the transition matrix
M' whose elements are the means of the multidimensional Beta
distributions. That is,
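\[ M' = \begin{pmatrix}
\dfrac{m_{11}}{m_{11}+m_{12}+m_{13}} & \dfrac{m_{12}}{m_{11}+m_{12}+m_{13}} & \dfrac{m_{13}}{m_{11}+m_{12}+m_{13}} \\[2ex]
\dfrac{m_{21}}{m_{21}+m_{22}+m_{23}} & \dfrac{m_{22}}{m_{21}+m_{22}+m_{23}} & \dfrac{m_{23}}{m_{21}+m_{22}+m_{23}} \\[2ex]
\dfrac{m_{31}}{m_{31}+m_{32}+m_{33}} & \dfrac{m_{32}}{m_{31}+m_{32}+m_{33}} & \dfrac{m_{33}}{m_{31}+m_{32}+m_{33}}
\end{pmatrix} \]

Here

\[ M' = \begin{pmatrix} 0.250 & 0.500 & 0.250 \\ 0.200 & 0.200 & 0.600 \\ 0.167 & 0.333 & 0.500 \end{pmatrix} \]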

Using the steady state results of Appendix F,

\[ (\pi_1)_{ex} = 0.194, \quad (\pi_2)_{ex} = 0.322, \quad (\pi_3)_{ex} = 0.484 \]
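As a check on these steady state values, the following sketch solves \(\pi M' = \pi\) exactly with rational arithmetic. It assumes the parameter matrix M = [[5, 10, 5], [1, 1, 3], [2, 4, 6]] (rows and columns ordered R, W, B), which is the matrix consistent with the printed means in M'. The Gauss-Jordan solver is an illustrative stand-in for the methods of Appendix F, which are not reproduced here.

```python
from fractions import Fraction


def steady_state_exact(P):
    """Solve pi P = pi with sum(pi) = 1 by Gauss-Jordan elimination over Fractions."""
    n = len(P)
    # Row j holds the balance equation for state j: sum_i pi_i P[i][j] - pi_j = 0.
    A = [[Fraction(P[i][j]) - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
         for j in range(n)]
    # Replace the last (redundant) balance equation by the normalization.
    A[-1] = [Fraction(1)] * n + [Fraction(1)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Assumed prior parameters (see the lead-in); M' is the matrix of row means.
M = [[5, 10, 5], [1, 1, 3], [2, 4, 6]]
M_prime = [[Fraction(m, sum(row)) for m in row] for row in M]
pi_ex = steady_state_exact(M_prime)
print(pi_ex)                      # exact fractions
print([float(p) for p in pi_ex])  # decimal values
```

Under these assumptions the exact solution is (6/31, 10/31, 15/31), which agrees with the three-decimal values quoted above to within 0.001.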


Therefore, from equation (5.10),

\[ s[E(R \mid (m_{ij})) - c] \approx 5[4(0.194) + 5(0.322) + 6(0.484) - 5.3] = 5[-.010] = -\$.050 < 0. \]

Hence, from equation (5.9),

\[ E(N.R. \mid (m_{ij})) = 0 \]

and Mark should stop rather than playing the game without the option;
i.e., Gary has duped the naive bystander (provided the latter agrees
with Mark's prior parameters).

Suppose  the  option  of  observing  two  rolls  starting  with  the  red 
die  is  accepted.  Then  all  the  following  events  are  possible: 


Event    Sequence    Nonzero transition counts
B_1      111         f_11 = 2
B_2      112         f_11 = 1, f_12 = 1
B_3      113         f_11 = 1, f_13 = 1
B_4      121         f_12 = 1, f_21 = 1
B_5      122         f_12 = 1, f_22 = 1
B_6      123         f_12 = 1, f_23 = 1
B_7      131         f_13 = 1, f_31 = 1
B_8      132         f_13 = 1, f_32 = 1
B_9      133         f_13 = 1, f_33 = 1

In each case f_ij = 0 for all other i, j combinations.


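Using equations (5.5) and (5.6),

\[ \mathrm{pr}(B_1) = \frac{\Gamma(20)\,\Gamma(7)\,\Gamma(10)\,\Gamma(5)}{\Gamma(22)\,\Gamma(5)\,\Gamma(10)\,\Gamma(5)} = \frac{6 \times 5}{21 \times 20} = .071 \]

\[ \mathrm{pr}(B_2) = \frac{\Gamma(20)\,\Gamma(6)\,\Gamma(11)\,\Gamma(5)}{\Gamma(22)\,\Gamma(5)\,\Gamma(10)\,\Gamma(5)} = \frac{5 \times 10}{21 \times 20} = .119 \]

\[ \mathrm{pr}(B_5) = \frac{\Gamma(20)\,\Gamma(5)\,\Gamma(11)\,\Gamma(5)}{\Gamma(21)\,\Gamma(5)\,\Gamma(10)\,\Gamma(5)} \times \frac{\Gamma(5)\,\Gamma(1)\,\Gamma(2)\,\Gamma(3)}{\Gamma(6)\,\Gamma(1)\,\Gamma(1)\,\Gamma(3)} = \frac{10}{20} \times \frac{1}{5} = .100 \]

Similarly,

pr(B_3) = .060
pr(B_4) = .100
pr(B_6) = .300
pr(B_7) = .042
pr(B_8) = .083
pr(B_9) = .125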
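The nine event probabilities can be reproduced mechanically from equations (5.5) and (5.6). The sketch below is illustrative only: it assumes the prior parameter matrix M = [[5, 10, 5], [1, 1, 3], [2, 4, 6]] (rows and columns ordered R, W, B), indexes the states 0, 1, 2, and uses hypothetical helper names. Log-Gamma values are used to avoid overflow for larger parameters.

```python
from itertools import product
from math import exp, lgamma


def row_probability(m_row, f_row):
    """Equation (5.5): probability of one row's transition counts in a known order."""
    log_p = lgamma(sum(m_row)) - lgamma(sum(m_row) + sum(f_row))
    for m, f in zip(m_row, f_row):
        log_p += lgamma(m + f) - lgamma(m)
    return exp(log_p)


def sequence_probability(M, F):
    """Equation (5.6): product of the row probabilities."""
    p = 1.0
    for m_row, f_row in zip(M, F):
        p *= row_probability(m_row, f_row)
    return p


def count_transitions(path, n_states):
    """Transition frequency count F for a sequence of visited states."""
    F = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(path, path[1:]):
        F[i][j] += 1
    return F


M = [[5, 10, 5], [1, 1, 3], [2, 4, 6]]  # assumed prior parameters (see lead-in)
events = {}
for tail in product(range(3), repeat=2):   # two observed rolls
    path = (0,) + tail                     # observation starts with the red die
    label = "".join("RWB"[s] for s in path)
    events[label] = sequence_probability(M, count_transitions(path, 3))

for label, p in events.items():
    print(label, round(p, 3))
print("total:", round(sum(events.values()), 6))
```

Because the nine events are exhaustive and mutually exclusive, their probabilities should sum to one, which gives a quick check on a hand calculation.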

After each experimental outcome we have a new matrix of
parameters given by M + F; e.g., after B_1,

\[ M_1 = M + F = \begin{pmatrix} 7 & 10 & 5 \\ 1 & 1 & 3 \\ 2 & 4 & 6 \end{pmatrix} \]

Using \(M_1\), we proceed exactly as we did with M to obtain

\[ s[E(R \mid (m_{ij}), B_1) - c], \]

Mark's expected return for playing the game after he has observed two
R-R transitions. This is repeated for each experimental outcome. The
results of the calculations are shown in Figure 5.3.

From that figure it is seen that it pays for Mark to use the
option of observing two rolls starting with the red die if

.0325 - 2d > 0

i.e.,

d < $.01625

If d > $.01625, he should refuse both the option and the game.

If he does buy the option, the diagram also illustrates that he should
play Gary's game only if B_6 (RWB) or B_9 (RBB) occurs. These two
outcomes modify the a priori parameters sufficiently in a favorable
direction to make the game worthwhile.

[Figure 5.3. A 3-state statistical decision problem. Legend: x = decision node; O = chance node.]

5.4.  Further  Items  of  Interest  for  Statistical  Decision  Purposes 

In this section several other items that are of interest in
statistical decision theory are presented as they relate to the Markov
process problems considered. For a more detailed background, the reader
is referred to Raiffa and Schlaifer's work [20].

5.4.1. The Value of Perfect Information


Suppose we are faced with a decision problem where there is
a random variable, x, involved with density function \(f_x(x_o)\). Assume
that we are able to calculate the expected net revenue of the optimum
decision procedure, \(E(N.R. \mid f_x(x_o))\), knowing only the density function
of x. Now, suppose that we were told the value of x, say \(x_o\). Then
there would be an associated expected net revenue of the optimum
decision procedure knowing that \(x = x_o\). Call this quantity \(E(N.R. \mid x_o)\).
Then, the a priori expected net revenue before we were told the value
of x would be given by

\[ E(N.R. \mid P.I.) = \int_{x_o} E(N.R. \mid x_o) \, f_x(x_o) \, dx_o \tag{5.11} \]

and the expected value of perfect information is defined to be

\[ E.V.P.I. = E(N.R. \mid P.I.) - E(N.R. \mid f_x(x_o)) \tag{5.12} \]

If confronted with a proposition where we would be told the exact value
of x for z dollars, we would accept if and only if \(E.V.P.I. \ge z\).

[20] Op. cit., Chapters 4 and 5.

This concept is simplest to illustrate for the 2-state decision
problem considered in section 5.2.1. We have a 2-state Markov process
with transition matrix

\[ P = \begin{pmatrix} 1 - a & a \\ b & 1 - b \end{pmatrix} \]

where "a" is known exactly, but \(f_b(b_o) = f_\beta(b_o \mid m, n)\). As before,
let c be the fixed cost per period for using the process, \(r_j\) be the
reward per time period for being in state j (j = 1, 2), and d be the
cost of observing a transition from state 2.

If  it  is  desirable,  we  can  use  the  process  for  s  periods  in  the 
steady  state.  Also,  we  have  the  option  of  buying  the  right  to  observe  k 
transitions  from  state  2. 

In section 5.2.3 it was shown that the expected net revenue
following an optimum procedure is

\[ E(N.R. \mid f_b(b_o)) = \max[0; \; E(N.R. \mid m, n, a); \; E(N.R. \mid m, n, a; k)] \tag{5.13} \]

To evaluate \(E(N.R. \mid P.I.)\), Appendix R shows that we proceed as
follows:

Case i. \(r_1 < c < r_2\)

\[ E(N.R. \mid P.I.) = \int_{0}^{(r_2 - c)a/(c - r_1)} s\left[r_1 - c + (r_2 - r_1)\frac{a}{a + b_o}\right] f_\beta(b_o \mid m, n) \, db_o \tag{5.14} \]

This would require numerical integration or an extremely involved
analytic integration. Finally, equations (5.13) and (5.14) would be
used to evaluate

\[ E.V.P.I. = E(N.R. \mid P.I.) - E(N.R. \mid f_b(b_o)) \tag{5.15} \]

Case ii. \(r_2 < c < r_1\)

\[ E(N.R. \mid P.I.) = \int_{(c - r_2)a/(r_1 - c)}^{1} s\left[r_1 - c + (r_2 - r_1)\frac{a}{a + b_o}\right] f_\beta(b_o \mid m, n) \, db_o \tag{5.16} \]

Then proceed as above.

Case iii. c < both \(r_1\) and \(r_2\), or c > both \(r_1\) and \(r_2\)

If c < both \(r_1\) and \(r_2\), we would always use the process,
hence the E.V.P.I. = 0. If c > both \(r_1\) and \(r_2\), we would never use
the process, hence the E.V.P.I. = 0.

Numerical  Example 


Consider, again, the Pierre-Louie fishing problem of section
5.2.3. Therefore,

a = 0.9, m = 9, n = 1, k = 1, d = 0.2, s = 5,
c = $40, r_1 = $60, r_2 = $20.

What is the expected value (to the fishermen) of perfect
information about the conditional probability, "b", that day n + 1 will
be sunny given that day n is rainy?

It was found that \(E(N.R. \mid f_b(b_o))\) = $.0484. The c and r's are
seen to correspond to Case ii above. Therefore, from equation (5.16),

\[ E(N.R. \mid P.I.) = \int_{20(0.9)/20}^{1} 5\left[\$20 - \$40 \, \frac{0.9}{0.9 + b_o}\right] \frac{1}{\beta(9, 1)} \, b_o^{8} \, db_o = 900 \int_{0.9}^{1} \left[ b_o^{8} - \frac{1.8 \, b_o^{8}}{0.9 + b_o} \right] db_o \]

To integrate the second term we use the substitution \(y = 0.9 + b_o\),
then expand the resulting \((y - 0.9)^8\) term in the numerator. We obtain

E(N.R. | P.I.) = $2.7648

Hence, from equation (5.15),

E.V.P.I. = 2.7648 - .0484 = $2.7164

This is the maximum amount that Pierre and Louie would be willing
to pay in return for learning the exact value of "b" (the transition
probability out of the rainy state). Note that the \(E(N.R. \mid f_b(b_o))\) value
of $.0484 is conditional upon the fact that in section 5.2.3 they were
allowed to pay for observation of one transition from state R. Hence,
the above E.V.P.I. value is also conditional upon that fact. If the
option of observing was not available, \(E(N.R. \mid f_b(b_o))\) would be zero as
shown in section 5.2.3 and in this case

E.V.P.I. = $2.7648 - 0 = $2.7648
These ideas can be extended to situations where we want to
know the value of perfect information concerning 2 or more random
variables. Presumably we would already know

\[ E(N.R. \mid f_{x_1, x_2, \ldots, x_k}(y_1, y_2, \ldots, y_k)), \]

the expected net revenue following an optimum policy when only the
density function (and not the exact values) of the random variables
\(x_1, x_2, \ldots, x_k\) is known. Then, equations (5.11) and (5.12) become

\[ E(N.R. \mid P.I.) = \int \cdots \int E(N.R. \mid y_1, y_2, \ldots, y_k) \, f_{x_1, \ldots, x_k}(y_1, \ldots, y_k) \, dy_1 \cdots dy_k \]

and

\[ E.V.P.I. = E(N.R. \mid P.I.) - E(N.R. \mid f_{x_1, \ldots, x_k}(y_1, y_2, \ldots, y_k)) \]

However, in practice the above integration would usually be very
difficult to perform.

5.4.2. The Choice of the Number of Observations to Take


Suppose that we are given a choice as to the number (k) of
observations that we can buy. However, the stipulation is added that
we must decide on the exact number before any observations are made.
Physical constraints or an opponent could place us in this position. This,
in effect, prevents the problem from becoming sequential in nature. (A
sequential decision structure will be treated in the next section.)

In principle, at least, the optimum number can be determined
in the following way. For any particular k we know, from earlier in
this study, how to determine the expected net revenue (including the
cost of observation) assuming that the observations will be made.
Although it has not been proved, it seems reasonable to assume that
the expected net revenue would be a unimodal function of the number
of observations for the decision framework and costs considered.

Hence, we determine the expected net revenue for a number of values of k
until the peak is detected. The corresponding k should be the optimum.

Numerical  Example 

For the same fishing example as that considered in sections
5.2.3 and 5.4.1, the expected net revenue as a function of the number
of observations (transitions from the rainy state) is as follows:

   k    Decision            E(N.R. | k)
   0    Stop                0
   1    Buy observation     $.0484
   2    Buy observations    $.1956
   3    Buy observations    $.2280
   4    Buy observations    $.1858
   5    Buy observations    $.0916
   6    Stop                0
   7    Stop                0
   8    Stop                0
   9    Stop                0
  10    Stop                0

The behavior of the expected net revenues is depicted in Figure 5.4. In
this example it is apparent that the expected net revenue is a unimodal
function of the number of observations. The optimum number of
observations of transitions from rainy days is three.

[Figure 5.4. Numerical example of the choice of the optimum number of observations. Vertical axis: dollars.]
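The peak-detection procedure can be sketched as a simple scan that stops at the first decrease. The function name and the table lookup used as the evaluator are illustrative assumptions. In an actual study each call would be a full calculation of E(N.R. | k), and the early stop is only justified if the unimodality assumption holds.

```python
def find_optimum_k(net_revenue, k_max):
    """Evaluate net_revenue(k) for k = 0, 1, ... and stop at the first decrease."""
    best_k, best_value = 0, net_revenue(0)
    for k in range(1, k_max + 1):
        value = net_revenue(k)
        if value < best_value:
            break  # past the peak, assuming a unimodal function of k
        best_k, best_value = k, value
    return best_k, best_value


# Expected net revenues from the table of the fishing example.
table = [0.0, 0.0484, 0.1956, 0.2280, 0.1858, 0.0916, 0.0, 0.0, 0.0, 0.0, 0.0]
best_k, best_value = find_optimum_k(lambda k: table[k], k_max=10)
print(best_k, best_value)
```

With the tabulated values the scan evaluates k = 0 through 4 and never needs the remaining entries.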


5.4.3. A Sequential Decision Structure

Now  we  make  the  decision  structure  more  flexible  by  dropping 
the  assumption  that  all  k  observations  must  be  made;  i.e.,  we  can  now 
stop  observing  after  any  number  of  observations  up  to  k.  However, 
k  is  still  assumed  known  and  fixed  (perhaps  due  to  a  deadline  on  the 
time  of  our  decision  to  use  or  not  to  use  the  process).  The  solution  of 
the  new  sequential  decision  problem  is  accomplished  simply  by  the  solu¬ 
tion  of  a  number  of  old  single  observation  problems.  Although  the 
method  will  work  for  the  N-state  case,  it  is  easiest  to  present  using 
the 2-state problem where "a" is known exactly and "b" is Beta
distributed. In fact, the problem considered will be identical to that of
section 5.2.3 except, as stipulated above, we can now stop observing after
any number of observations up to k.

Define \(v_j(m, n)\) = expected net return if there are j possible
observations left, the parameters of the Beta distribution are m and n,
and an optimum policy is followed. Then

\[ v_j(m, n) = \max_{O,U,S} \begin{cases} \text{Observe:} & \dfrac{m}{m+n} \, v_{j-1}(m+1, n) + \dfrac{n}{m+n} \, v_{j-1}(m, n+1) - d \\ \text{Use:} & E(N.R. \mid m, n, a) \\ \text{Stop:} & 0 \end{cases} \qquad j \ge 1 \tag{5.17} \]

and

\[ v_0(m, n) = \max_{U,S} \begin{cases} \text{Use:} & E(N.R. \mid m, n, a) \\ \text{Stop:} & 0 \end{cases} \tag{5.18} \]

The determination of \(v_0(m, n)\) is seen to be identical to finding whether
or not to use the process when no observation is possible. This was
done in section 5.2.3. Then equation (5.17) is used recursively to
solve the rest of the problem. For a particular (j, m, n) triplet, the
solution of equation (5.17) is identical to solving a single observation
problem except that \(E(N.R. \mid m+1, n, a)\) and \(E(N.R. \mid m, n+1, a)\) are
replaced by \(v_{j-1}(m+1, n)\) and \(v_{j-1}(m, n+1)\), respectively.

It  should  be  noted  that  because  of  the  sequential  nature  of  the 
decision  making  we  now  update  the  Beta  distribution  on  "b"  after  each 
observation.  In  an  N-state  problem  we  would  have  to  update  the 
appropriate  multidimensional  Beta  distribution  after  each  observation. 
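The recursion of equations (5.17) and (5.18) lends itself to a memoized implementation. The sketch below uses the fishing data quoted in this chapter (a = 0.9, s = 5, c = 40, r_1 = 60, r_2 = 20, d = 0.2). Two points are assumptions: E(N.R. | m, n, a) is taken to be s[r_1 - c + (r_2 - r_1) a E(1/(a + b))], following the integrand of equation (5.14), and the expectation is evaluated by Simpson's rule rather than by the analytic result of section 5.2.3, which is not reproduced here.

```python
from functools import lru_cache
from math import exp, lgamma

A, S, C, R1, R2, D = 0.9, 5, 40.0, 60.0, 20.0, 0.2  # fishing example data


def beta_pdf(b, m, n):
    """Beta density with parameters m, n (integers >= 1 assumed)."""
    log_norm = lgamma(m + n) - lgamma(m) - lgamma(n)
    return exp(log_norm) * b ** (m - 1) * (1.0 - b) ** (n - 1)


@lru_cache(maxsize=None)
def use_value(m, n, steps=2000):
    """E(N.R. | m, n, a), with E(1/(a + b)) computed by composite Simpson's rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps + 1):
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        b = i * h
        total += weight * beta_pdf(b, m, n) / (A + b)
    expected_inverse = total * h / 3.0
    return S * (R1 - C + (R2 - R1) * A * expected_inverse)


@lru_cache(maxsize=None)
def v(j, m, n):
    """Equations (5.17) and (5.18): value with j observations still available."""
    stop, use = 0.0, use_value(m, n)
    if j == 0:
        return max(use, stop)
    p = m / (m + n)  # predictive probability that the next observation raises m
    observe = p * v(j - 1, m + 1, n) + (1.0 - p) * v(j - 1, m, n + 1) - D
    return max(observe, use, stop)


for j in range(3):
    print(j, round(v(j, 9, 1), 4))
```

Memoization matters because different observation histories lead to the same (j, m, n) triplet, so each triplet is solved only once.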


Numerical  Example 

We again consider the same fishing example as in sections 5.2.3,
5.4.1 and 5.4.2. However, now we assume that Pierre and Louie
can stop buying information about days following rainy days after 1
or 2 observations. The corresponding decision tree is shown in
Figure 5.5. As would be anticipated, the expected net revenue of
$.2152 is higher than the value of $.1956 found in section 5.4.2,
where they were forced to buy both observations at once.

5.5.  Statistical  Decisions  for  a  Transient  Situation 


In section 3.4 we outlined how to determine the multi-step
transition probabilities when the \(p_{ij}\)'s are random variables instead of
being exactly known. It was shown that the method was most practical
for 2-state processes. Hence, we shall restrict our attention here to
2-state situations.

The decision framework is the following. We have a 2-state
process with both transition probabilities independently Beta distributed.
The costs and rewards are all exactly known. The crucial question is:
"What is the expected net revenue in the next h time periods given that
we shall have a transition from state i just before the first period?"

Let us denote this quantity by \(E(N.R. \mid h, i, (m_{ij}))\) where \((m_{ij})\) are the
parameters of the Beta distributions. Then

\[ E(N.R. \mid h, i, (m_{ij})) = \sum_{n=1}^{h} \sum_{j=1}^{2} \phi_{ij}(n \mid (m_{ij})) \, r_j - hc \tag{5.19} \]

where \(\phi_{ij}(n \mid (m_{ij}))\) = probability that the process will be in state j at
time n given that the state at time 0 is i and the parameters of the Beta
distributions are \((m_{ij})\); and, as earlier,

\(r_j\) = reward per period for being in state j

and

c = cost per period for using the process.
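Equation (5.19) can be estimated by simulation when a closed form for the \(\phi_{ij}(n)\) is inconvenient. The sketch below is a Monte Carlo illustration, not the analytic method referred to in the text. It assumes row i of the transition matrix has first entry distributed Beta(m_i1, m_i2), indexes the states 0 and 1, and uses hypothetical parameter values.

```python
import random


def transient_net_revenue(m, i, h, r, c, n_samples=20000, seed=0):
    """Monte Carlo estimate of equation (5.19) for a 2-state process.

    m[i] = (m_i1, m_i2) are the Beta parameters of row i; i is the state at time 0.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        p11 = rng.betavariate(m[0][0], m[0][1])
        p21 = rng.betavariate(m[1][0], m[1][1])
        P = [[p11, 1.0 - p11], [p21, 1.0 - p21]]
        dist = [1.0 if s == i else 0.0 for s in range(2)]
        for _ in range(h):
            # State probabilities one period later, for this sampled matrix.
            dist = [dist[0] * P[0][j] + dist[1] * P[1][j] for j in range(2)]
            total += dist[0] * r[0] + dist[1] * r[1]
    return total / n_samples - h * c


# Hypothetical parameters, rewards and cost.
m = [(3, 7), (6, 4)]
estimate_1 = transient_net_revenue(m, i=0, h=1, r=(60.0, 20.0), c=40.0)
estimate_2 = transient_net_revenue(m, i=0, h=2, r=(60.0, 20.0), c=40.0)
print(round(estimate_1, 2), round(estimate_2, 2))
```

For h = 1 the estimate should be close to the value obtained from the mean transition probabilities (here -8), but for h = 2 the expected two-step probability involves E(p_11^2), so it differs from the square of the mean matrix.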

Now  there  is  no  need  to  elaborate  in  detail  on  the  statistical 
decision  problems  for  a  transient  situation  because  the  methods  are 
essentially  identical  to  those  for  the  cases  studied  earlier  where  the 
process was to be used in the steady state. We merely replace
the \(E(N.R. \mid (m_{ij}))\) of the steady state problem by \(E(N.R. \mid h, i, (m_{ij}))\).
There are other obvious minor modifications such as reducing h
to h - 1 if we delay using the process for 1 period. Hence, we can
handle all the following situations for a 2-state transient problem where
the transition probabilities are independently Beta distributed:

i) Determination of the expected net revenue in the next h
periods given that we are presently in state i.

ii)  Study  of  whether  or  not  to  delay  and  pay  for  observations 
before  deciding  to  use  or  not  use  the  process  (including  the  possibility 
of  sequential  decisions). 

iii)  Evaluation  of  the  optimum  number  of  observations  to  make 
before  deciding  whether  or  not  to  use  the  process. 

iv)  Determination  of  the  expected  value  of  perfect  informa¬ 
tion  about  one  or  both  of  the  transition  probabilities. 

Although  the  transient  statistical  decision  problem  has  been 
so  lightly  covered  due  to  the  fact  that  its  solution  is  very  similar  to 
that  of  the  steady  state  problem,  its  practical  importance  should  not 
be  underestimated.  In  fact,  the  transient  situation  is  probably  more 
likely  to  occur  in  the  real  world  than  the  steady  state  problems  con¬ 
sidered.  An  even  more  realistic  extension  would  be  the  situation  where 
the  process  could  be  used  indefinitely  long  with  discounting  starting  at 
the  very  next  transition.  Also,  we  have  not  considered  the  situation 
where  once  we  have  decided  to  use  a  process  we  can  change  our 
strategy  (i.e.,  make  further  decisions)  as  we  observe  the  process  in 
operation  during  the  time  in  which  we  are  using  it.  This  problem  is  of 
an  adaptive  control  nature. 

5.6. Summary

The  statistical  decision  framework  developed  for  a  Markov 
process  whose  transition  probabilities  are  not  assumed  exactly  known 
can  best  be  summarized  with  the  aid  of  the  block  diagram  of  Figure  5.  6. 

In  box  1  we  place  multidimensional  Beta  distributions  over  the 
transition  probabilities  of  the  Markov  process  considered.  The  pro¬ 
cedure  for  doing  this  was  discussed  in  section  3.5. 

Next, if the process will be used in the steady state, the
expected values of the steady state probabilities are determined as
outlined in section 4.1. For the special 2-state process where one
transition probability is exactly known and the other is Beta distributed this
can be done analytically (section 4.1.3). For more than 2 states we
require either simulation or the approximation technique of using \((\pi_j)_{ex}\)
for \(E(\pi_j)\), discussed in sections 4.1.6 and 4.1.7. If the transient
behavior is important, we obtain the multi-step transition probabilities,
\(\phi_{ij}(n)\), by the methods of section 4.4.

Box 3 is concerned with the use of the \(E(\pi_j)\)'s or \(\phi_{ij}(n)\)'s to
answer  various  statistical  decision  questions  (as  discussed  at  length 
in  the  present  chapter).  Usually  we  would  first  evaluate  the  expected 
net  revenues  under  various  policies,  then  select  the  policy  that  maxi¬ 
mizes  the  expected  net  revenue.  Several  other  items  can  be  obtained 
including  the  optimum  number  of  observations  to  take  and  the  expected 
value  of  perfect  information  about  one  or  more  of  the  unknown  transition 
probabilities. 

Having  decided  on  the  policy  to  use,  we  either  terminate  the 
decision  process  or  allow  the  Markov  process  to  run  for  one  or  more 
transitions as noted in box 4. When transitions are observed, modification
of the prior distributions through the use of Bayes' rule is extremely
simple  because  of  their  form  (multidimensional  Betas).  As  illustrated 
in  section  3.4,  the  posterior  distributions  are  again  multidimensional 
Betas,  only  the  parameters  have  been  modified. 

Now, because the transition probabilities are still multidimensional
Beta distributed, we are justified in drawing the feedback (6) to
box 2. The cycle is complete. We are again ready to calculate either
the \(E(\pi_j)\)'s or the \(\phi_{ij}(n)\)'s by the exact same method, then move on to
box 3 for new statistical decisions, etc.

Hence,  utilization  of  multidimensional  Beta  prior  distributions 
over  the  transition  probabilities  has  enabled  us  to  place  Markov  proc¬ 
esses  into  a  most  convenient  statistical  decision  framework. 
It should be mentioned that besides their uses in the decision
problems considered in this chapter, the expected rewards in various
periods also allow direct comparison of the future values of several
possible Markov processes if such a comparison is of interest.
CHAPTER 6


A  BRIEF  LOOK  AT  CONTINUOUS  TIME  PROCESSES 


A continuous time Markov process is defined as follows. Given
that the process is now in state i, in the next time interval dt it will
make a transition to state j with probability \(a_{ij}\,dt\) \((i \ne j)\). (This implies
that the state occupancy times are exponentially distributed.) For
small enough dt the probability of two or more transitions is assumed
to be zero. The quantity \(a_{ij}\) \((i \ne j)\) is called the transition rate from
state i to state j. The process can be completely described by a
matrix \(A = (a_{ij})\) where we define

\[ a_{ii} = -\sum_{j \ne i} a_{ij} \]

The  primary  objective  of  this  chapter  is  to  show  that  essentially 
the  same  approach  as  was  used  to  incorporate  uncertainty  as  to  the 
values  of  the  transition  probabilities  in  a  discrete  time  process  can 
also  be  utilized  to  take  into  account  uncertainty  as  to  the  values  of  the 
transition  rates  in  a  continuous  time  process.  Consequently,  the  treat¬ 
ment  will  be  rather  brief;  in  fact,  we  shall  only  look  at  the  determina¬ 
tion  of  the  expected  values  of  the  steady  state  probabilities  in  a  2-state 
process  and  then  make  some  general  remarks  about  other  aspects  of  the 
continuous  time  problem. 

6.1. The Use of Gamma Prior Distributions on the Transition Rates
of a 2-State Process

6.1.1.  Justification  for  the  Use  of  the  Gamma  Priors 


Consider one of the states. As was mentioned earlier, the
occupancy times within that state are exponentially distributed. More
precisely, let \(t_i\) be the occupancy time of the state considered. Then,

\[ f_{t_i \mid a_{ij}}(t_o \mid a_o) = a_o e^{-a_o t_o}, \qquad 0 \le t_o \]

where \(a_{ij}\) is the transition rate from the state considered to the other
state. But this is the special case of the Gamma distribution

\[ f_{t_i \mid a_{ij}}(t_o \mid a_o) = f_\gamma(t_o \mid \alpha, a_o) = \frac{a_o^{\alpha} \, t_o^{\alpha - 1} e^{-a_o t_o}}{\Gamma(\alpha)}, \qquad 0 \le t_o \tag{6.1} \]

when \(\alpha = 1\). Now, as shown in section S.1 of Appendix S, it is
convenient for Bayes calculations to place a Gamma prior on \(a_{ij}\) when the
distribution of \(t_i\) is as in equation (6.1) and \(\alpha\) is known exactly. Hence,
it is advisable to use the Gamma prior on \(a_{ij}\) in the 2-state problem;
i.e., we choose

\[ f_{a_{12}}(x) = f_\gamma(x \mid v_1, w_1) \]

and

\[ f_{a_{21}}(y) = f_\gamma(y \mid v_2, w_2) \tag{6.2} \]

6.1.2.  Bayes  Modification  of  the  Gamma  Prior 

Let \(E_1\) be the event that in \(k_1\) occupancies of state 1 the total
occupancy time is \(T_1\). Then, using the results of section S.3 of
Appendix S we know that

\[ f_{a_{12}}(x \mid E_1) = f_\gamma(x \mid v_1 + k_1, \, w_1 + T_1), \]

a new Gamma distribution. Similarly, if \(E_2\) represents the event that
in \(k_2\) occupancies of state 2 the total occupancy time is \(T_2\), then

\[ f_{a_{21}}(y \mid E_2) = f_\gamma(y \mid v_2 + k_2, \, w_2 + T_2) \]


6.1.3.  Selection  of  the  A  Priori  Parameters 


For

\[ f_{t_1 \mid a_{12}}(t_o \mid a_o) = a_o e^{-a_o t_o}, \qquad 0 \le t_o \]

and

\[ f_{a_{12}}(a_o) = f_\gamma(a_o \mid v_1, w_1) \]

it is shown in section S.2 of Appendix S that the marginal distribution
of \(t_1\) is

\[ f_{t_1}(t_o) = \frac{v_1 w_1^{v_1}}{(t_o + w_1)^{v_1 + 1}}, \qquad 0 \le t_o \]

and

\[ E(t_1) = \frac{w_1}{v_1 - 1} \]

and

\[ \overset{v}{t}_1 = \frac{v_1 w_1^{2}}{(v_1 - 1)^{2} (v_1 - 2)}, \qquad v_1 > 2 \]

Using these last two equations we can select \(v_1\) and \(w_1\) so as
to satisfy our prior estimates of \(E(t_1)\) and the variance \(\overset{v}{t}_1\). Similar
reasoning would hold for the evaluation of the two parameters of the prior
Gamma distribution on the other transition rate, \(a_{21}\).
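Solving the two moment equations for the prior parameters is a short calculation. Dividing the variance by the squared mean gives \(v_1/(v_1 - 2)\), so \(v_1 = 2R/(R - 1)\) with R = variance / mean², and then \(w_1 = E(t_1)(v_1 - 1)\). The sketch below implements this; the function names and the numerical estimates are hypothetical.

```python
def gamma_prior_from_moments(mean_t, var_t):
    """Solve E(t) = w/(v-1) and var(t) = v w^2 / ((v-1)^2 (v-2)) for (v, w)."""
    ratio = var_t / mean_t ** 2
    if ratio <= 1.0:
        # The marginal occupancy time always has variance > mean^2 when v > 2.
        raise ValueError("prior variance must exceed the squared prior mean")
    v = 2.0 * ratio / (ratio - 1.0)
    w = mean_t * (v - 1.0)
    return v, w


def occupancy_moments(v, w):
    """Mean and variance of the marginal occupancy time (requires v > 2)."""
    mean_t = w / (v - 1.0)
    var_t = v * w ** 2 / ((v - 1.0) ** 2 * (v - 2.0))
    return mean_t, var_t


# Hypothetical prior estimates: mean occupancy time 2, variance 6.
v1, w1 = gamma_prior_from_moments(2.0, 6.0)
print(v1, w1, occupancy_moments(v1, w1))
```

The restriction that the variance exceed the squared mean reflects the condition \(v_1 > 2\): a prior estimate with a smaller variance cannot be matched by any Gamma prior of this form.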


6.2. Determination of the Expected Values of the Steady State
Probabilities of a 2-State Process When the Two Transition Rates are
Independently Gamma Distributed

Consider the 2-state process with transition rate matrix

    A = [ -a_{12}    a_{12} ]
        [  a_{21}   -a_{21} ]

where

    f_{a_{12}}(r) = f_\gamma(r | v_1, w_1)

and                                                                  (6.3)

    f_{a_{21}}(s) = f_\gamma(s | v_2, w_2)

For a_{12} and a_{21} exactly known it can easily be shown that the
steady state probabilities are given by

    \pi_1 = \frac{a_{21}}{a_{12} + a_{21}}

and                                                                  (6.4)

    \pi_2 = \frac{a_{12}}{a_{12} + a_{21}}

However, as a_{12} and a_{21} are random variables, \pi_1 and \pi_2 are
now random variables. Again, the following mechanism will be used. We
randomly select an (a_{12}, a_{21}) pair from their distributions and
determine the associated exactly known steady state probabilities. If
this is repeated a large number of times, we would like to know the
expected values of \pi_1 and \pi_2. Interestingly enough, we can
evaluate the entire density functions as well as the expected values.

If \pi_1 and \pi_2 are as defined in equation (6.4) and a_{12} and
a_{21} are distributed as in equation (6.3), then using the theory of
derived distributions the density functions of \pi_1 and \pi_2 are

    f_{\pi_1}(x) = \frac{w_1^{v_1} w_2^{v_2}}{\beta(v_1, v_2)} \cdot \frac{x^{v_2-1} (1-x)^{v_1-1}}{[w_1 + (w_2 - w_1)x]^{v_1+v_2}},   0 \le x \le 1

and                                                                  (6.5)

    f_{\pi_2}(z) = \frac{w_1^{v_1} w_2^{v_2}}{\beta(v_1, v_2)} \cdot \frac{z^{v_1-1} (1-z)^{v_2-1}}{[w_2 + (w_1 - w_2)z]^{v_1+v_2}},   0 \le z \le 1

For w_1 = w_2 these are seen to reduce to the Beta densities
f_\beta(x | v_2, v_1) and f_\beta(z | v_1, v_2), respectively, a well
known result. In that case we can directly state that

    E(\pi_1) = \frac{v_2}{v_1 + v_2}

    E(\pi_2) = \frac{v_1}{v_1 + v_2}

However, if we force the w's to be equal, we lose one degree of freedom
and we then have only 3 a priori parameters with which to satisfy
the prior means and variances of the occupancy times (4 quantities).

If w_1 > w_2, from equation (6.5)

    E(\pi_1) = \frac{w_1^{v_1} w_2^{v_2}}{\beta(v_1, v_2)} \int_0^1 \frac{x^{v_2} (1-x)^{v_1-1}}{[w_1 + (w_2 - w_1)x]^{v_1+v_2}} dx

             = \left(\frac{w_2}{w_1}\right)^{v_2} \frac{v_2}{v_1+v_2} \cdot \frac{\Gamma(v_1+v_2+1)}{\Gamma(v_1)\Gamma(v_2+1)} \int_0^1 x^{v_2+1-1} (1-x)^{v_1-1} \left[1 - \left(1 - \frac{w_2}{w_1}\right)x\right]^{-(v_1+v_2)} dx

and using equation (G.2) of Appendix G, which defines the hypergeometric
function F,

    E(\pi_1) = \left(\frac{w_2}{w_1}\right)^{v_2} \frac{v_2}{v_1+v_2} F\left(v_1+v_2, v_2+1; v_1+v_2+1; 1 - \frac{w_2}{w_1}\right)     [convergent because w_1 > w_2]     (6.6)

Then

    E(\pi_2) = 1 - E(\pi_1).

If w_2 > w_1, we evaluate E(\pi_2) first by

    E(\pi_2) = \left(\frac{w_1}{w_2}\right)^{v_1} \frac{v_1}{v_1+v_2} F\left(v_1+v_2, v_1+1; v_1+v_2+1; 1 - \frac{w_1}{w_2}\right)     (6.7)

and then use E(\pi_1) = 1 - E(\pi_2).
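
Equations (6.6) and (6.7) can be checked numerically. The sketch below is an illustration only and makes several assumptions: the hypergeometric function is summed directly from its Gauss power series (valid for |z| < 1, which is why the branch on w_1 versus w_2 is needed), the function names are hypothetical, and the parameter values are made up. The closed form is compared with a simulation of the mechanism described above, drawing (a_{12}, a_{21}) pairs and averaging \pi_1.

```python
import random


def hyp2f1(a, b, c, z, tol=1e-14, max_terms=100000):
    """Gauss series sum_n (a)_n (b)_n / ((c)_n n!) z**n, for |z| < 1."""
    if abs(z) >= 1.0:
        raise ValueError("series used here requires |z| < 1")
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def expected_pi1(v1, w1, v2, w2):
    """E(pi_1) from (6.6) when w1 >= w2, otherwise 1 - E(pi_2) from (6.7)."""
    if w1 >= w2:
        return ((w2 / w1) ** v2 * v2 / (v1 + v2)
                * hyp2f1(v1 + v2, v2 + 1, v1 + v2 + 1, 1.0 - w2 / w1))
    e_pi2 = ((w1 / w2) ** v1 * v1 / (v1 + v2)
             * hyp2f1(v1 + v2, v1 + 1, v1 + v2 + 1, 1.0 - w1 / w2))
    return 1.0 - e_pi2


def simulated_pi1(v1, w1, v2, w2, draws=200000, seed=0):
    """Average pi_1 = a21 / (a12 + a21) over independent Gamma draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        a12 = rng.gammavariate(v1, 1.0 / w1)   # shape v1, scale 1/w1
        a21 = rng.gammavariate(v2, 1.0 / w2)   # shape v2, scale 1/w2
        total += a21 / (a12 + a21)
    return total / draws


# Made-up prior parameters, one case for each branch.
exact_a = expected_pi1(3, 4.0, 2, 2.0)      # w1 > w2, uses (6.6)
exact_b = expected_pi1(3, 2.0, 2, 4.0)      # w2 > w1, uses (6.7)
approx_a = simulated_pi1(3, 4.0, 2, 2.0)
approx_b = simulated_pi1(3, 2.0, 2, 4.0)
```

With w_1 = w_2 the series argument is zero, F equals one, and the function returns v_2/(v_1 + v_2), in agreement with the Beta result quoted earlier.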


6.3.  General  Remarks 

We  now  have  precisely  the  same  framework  as  in  the  2-state 
discrete  case  except  the  two  transition  rates  are  independently  Gamma 
distributed,  whereas  the  two  transition  probabilities  were  independently 
Beta  distributed.  Hence,  we  can  determine  many  of  the  quantities  that 
were  obtained  in  the  discrete  case  —  e.g.,  the  expected  revenue  per  unit 
time  in  the  steady  state,  the  worthwhileness  (preposterior  analysis)  of 
observing  several  transitions,  the  value  of  knowing  a  transition  rate 
exactly,  etc.  Also,  for  more  than  2  states  the  same  analytic  difficulties 
as  those  in  the  discrete  situation  would  be  encountered. 

It should be noted that in the discrete case the hypergeometric
function was required for the situation where only one transition
probability was not known exactly; we did not obtain a closed form
solution for the expected values of the steady state probabilities
(other than a complicated summation) for the situation where the two
transition probabilities were independently Beta distributed. However,
in the continuous case use of the hypergeometric function has enabled
us to obtain simple closed form expressions for the expected values of
the steady state probabilities when both transition rates are
independently Gamma distributed.


SECTION  II 


TRANSITION  PROBABILITIES  EXACTLY  KNOWN  BUT  REWARDS 
ARE  NOW  RANDOM  VARIABLES 


In Section I it was assumed that the rewards were exactly known
but that the transition probabilities (or rates or matrices) were
random variables. Now, we consider the completely opposite situation:
the transition probabilities are assumed exactly known but the rewards
are now random variables. This new situation turns out to be easier to
handle in most respects than the ones considered in Section I.

The  first  step  is  to  develop  convenient  prior  distributions  to 
use  on  the  rewards,  convenient  in  the  sense  that  they  allow  easy  Bayes 
modification  and  simple  determination  of  the  expected  values  of  the 
rewards,  and  their  ranges  satisfy  the  physical  constraints.  Also,  the 
actual  Bayes  modification  of  a  prior  distribution  on  a  reward  after 
several  sample  rewards  have  been  observed  is  demonstrated  for  two 
different  prior  distributions.  Then  we  consider  the  expected  rewards 
in  steady  state  and  transient  situations,  very  important  quantities  for 
statistical decision purposes. Knowing how to evaluate these quantities
for appropriate prior distributions on the rewards, we next deal with
typical statistical decision problems based on either steady state or
transient situations.

CHAPTER 7


CONVENIENT  PRIOR  DISTRIBUTIONS  TO  USE  ON  THE  REWARDS 


As was the case for the E(\pi_i)'s in Section I, it will be
demonstrated in Chapter 8 that the expected values of the rewards, the
E(r_{ij})'s, are the critical quantities to know for many Markov
decision purposes. (r_{ij} = reward per transition from state i to
state j.) Hence, it is important that a prior distribution on an
r_{ij} allows easy determination of E(r_{ij}). As in Section I, the
other two required properties of the prior distribution on an r_{ij}
are:

i) The distribution must be convenient for Bayes calculations.
Ideally, after some sample values of r_{ij} are observed we would like
to obtain a posterior distribution on r_{ij} which is a member of the
same family as the prior distribution.

ii) The range of the distribution must satisfy actual physical
constraints. Two physical situations will be considered in detail: the
first where a reward can lie anywhere between 0 and \infty, the second
where the range is -\infty to \infty. A third situation, where the
range is finite, will be briefly mentioned.


7.1. The Range of "r" is (0, \infty): Exponential-Gamma Form

7.1.1. Determination of the Form of the Prior Distribution

The probability density function of a random variable having
this range can often be adequately described by an exponential
distribution whose parameter is in turn a random variable. That is,

    f_{r|\lambda}(r_o | \lambda_o) = \lambda_o e^{-\lambda_o r_o},   0 \le r_o        (7.1)

where \lambda, in turn, has a density function f_\lambda(\lambda_o).
This latter density function is the one that must be chosen so as to
satisfy the requirements of easy determination of E(r) and simple Bayes
modification.

It should be recognized that a Gamma distribution on r given \lambda
would allow greater flexibility than the exponential distribution of
equation (7.1). However, as demonstrated in Appendix S, use of a
Gamma distribution instead of an exponential leads to serious
difficulties in assigning the prior parameters of the distribution of
\lambda. Hence, we sacrifice some flexibility in order to simplify the
assignment of the prior parameters.

The situation here is identical to that considered in Chapter 6
where we had exponential holding times whose parameters (the transition
rates) were in turn random variables. There, with the use of
Appendix S, we found that it was most convenient to have each
transition rate Gamma distributed. Hence, in the present context the
\lambda parameter of equation (7.1) should be Gamma distributed. That is,

    f_\lambda(\lambda_o) = f_\gamma(\lambda_o | v, w) = \frac{w^v \lambda_o^{v-1} e^{-w\lambda_o}}{\Gamma(v)},   0 \le \lambda_o        (7.2)

7.1.2. The Marginal Distribution of "r" and its Mean and Variance

Section S.2 of Appendix S shows that for the conditions of
equations (7.1) and (7.2), the marginal density function of r is

    f_r(r_o) = \int_0^\infty f_{r|\lambda}(r_o | \lambda_o) f_\lambda(\lambda_o) d\lambda_o = \frac{v w^v}{(r_o + w)^{v+1}},   0 \le r_o        (7.3)

The expected value of r is

    E(r) = \frac{w}{v - 1}   for v \ge 2        (7.4)

and the variance is

    \overset{v}{r} = \frac{v w^2}{(v-1)^2 (v-2)}   for v \ge 3        (7.5)

7.1.3. Determination of the A Priori Parameters

Using equations (7.4) and (7.5) we can select v and w so as to
satisfy our prior estimates of E(r) and \overset{v}{r}, the mean and
variance of r. These latter two marginal quantities are easier to
estimate a priori than the moments of the parameter \lambda itself.
The reason that this is mentioned is that the mean and variance of
\lambda can also be expressed in terms of v and w, and an alternate
method of obtaining values for v and w would be to select the values
such that the estimates of the mean and variance of \lambda were
satisfied.


7.1.4. Bayes Modification of the Gamma Prior Distribution

Consider the variable r with

    f_{r|\lambda}(r_o | \lambda_o) = \lambda_o e^{-\lambda_o r_o},   0 \le r_o

where

    f_\lambda(\lambda_o) = f_\gamma(\lambda_o | v, w)

Let E represent the event that k independent values of r sum to T_k.
Section S.3 of Appendix S illustrates that use of Bayes' rule gives the
following a posteriori distribution on \lambda:

    f_{\lambda|E}(\lambda_o | E) = f_\gamma(\lambda_o | v + k, w + T_k)

which is another member of the same Gamma family, precisely what
was wanted. Furthermore, the a posteriori marginal distribution of r
is of the same form as its a priori distribution. Using the results of
section 7.1.2, there follows

    f_{r|E}(r_o | E) = \frac{(v+k)(w+T_k)^{v+k}}{(r_o + w + T_k)^{v+k+1}},   0 \le r_o

and

    E(r | E) = \frac{w + T_k}{v + k - 1}

Hence, the expected value of r on the next draw is still extremely
easy to calculate after we have observed the event E; this was another
desired consequence of the form of distribution placed on \lambda.
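
The Bayes modification of section 7.1.4 reduces to simple bookkeeping on (v, w). The following minimal sketch illustrates this; the function names and the sample rewards are assumptions introduced for illustration, not quantities from the text.

```python
def update_exponential_gamma(v, w, rewards):
    """Posterior (v, w) after observing independent rewards.

    Implements v -> v + k and w -> w + T_k, where k is the number of
    observed rewards and T_k is their sum. Hypothetical helper name.
    """
    k = len(rewards)
    total = sum(rewards)
    return v + k, w + total


def expected_reward(v, w):
    """E(r) = w / (v - 1); the mean exists only for v > 1."""
    if v <= 1:
        raise ValueError("E(r) requires v > 1")
    return w / (v - 1)


# Made-up prior and observations.
v0, w0 = 3, 400.0
prior_mean = expected_reward(v0, w0)                      # 400 / 2
v_post, w_post = update_exponential_gamma(v0, w0, [250.0, 350.0])
posterior_mean = expected_reward(v_post, w_post)          # (400 + 600) / 4
```

Because the update only adds the count and the sum of the observations, processing the rewards one at a time or all at once gives the same posterior parameters.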

7.2. The Range of "r" is (-\infty, \infty): Normal-Normal Form

7.2.1. Determination of the Form of the Prior Distribution

For this range the logical choice for a density function for r
is the Normal distribution. We could be quite general and allow both
parameters of the Normal to be random variables. However, this
causes the calculations to become quite involved (but they can be
carried out). Instead, attention will be restricted to the case where
the variance of the Normal is assumed exactly known but the mean is
considered to be a random variable. In other words,

    f_{r|\mu}(r_o | \mu_o) = f_N(r_o | \mu_o, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(r_o - \mu_o)^2/(2\sigma^2)},   -\infty < r_o < \infty        (7.6)

and we have a density function f_\mu(\mu_o). Again, this latter density
function must be selected so as to satisfy the requirements of easy
determination of E(r) and simple Bayes modification.

Appendix T reveals that the proper choice is

    f_\mu(\mu_o) = f_N(\mu_o | v, w\sigma^2) = \frac{1}{\sqrt{2\pi w}\,\sigma} e^{-(\mu_o - v)^2/(2w\sigma^2)},   -\infty < \mu_o < \infty        (7.7)

That is, the mean should, itself, be normally distributed.

7.2.2. The Marginal Distribution of "r" and its Mean and Variance

Section T.2 of Appendix T shows that for the conditions of
equations (7.6) and (7.7) the marginal density function of r is

    f_r(r_o) = \int_{-\infty}^{\infty} f_{r|\mu}(r_o | \mu_o) f_\mu(\mu_o) d\mu_o

which leads to

    f_r(r_o) = f_N(r_o | v, (w+1)\sigma^2)        (7.8)

Hence, r is normally distributed with mean v and variance (w+1)\sigma^2:

    E(r) = v        (7.9)

and

    \overset{v}{r} = (w+1)\sigma^2        (7.10)

7.2.3. Determination of the A Priori Parameters

Using equations (7.9) and (7.10) we can select v and w so as
to satisfy our prior estimates of E(r) and \overset{v}{r}, the mean and
variance of r. This is possible because \sigma^2 is assumed exactly
known.

7.2.4. Bayes Modification of the Normal Prior Distribution

Consider the random variable r with

    f_{r|\mu}(r_o | \mu_o) = f_N(r_o | \mu_o, \sigma^2)

where, in turn,

    f_\mu(\mu_o) = f_N(\mu_o | v, w\sigma^2)

Let E represent the event that k independent values of r sum
to T_k. Section T.3 of Appendix T shows that use of Bayes' rule gives
the following a posteriori distribution on \mu:

    f_{\mu|E}(\mu_o | E) = f_N\left(\mu_o \,\Big|\, \frac{v + wT_k}{kw + 1}, \frac{w}{kw + 1}\sigma^2\right)

which is another member of the same Normal family. Furthermore,
the a posteriori marginal distribution of r is again a member of the
Normal family. Using the results of section 7.2.2, there follows

    f_{r|E}(r_o | E) = f_N\left(r_o \,\Big|\, \frac{v + wT_k}{kw + 1}, \left(\frac{w}{kw + 1} + 1\right)\sigma^2\right)

and

    E(r | E) = \frac{v + wT_k}{kw + 1}

Thus, a Normal distribution on r with its variance exactly
known but its mean, in turn, normally distributed, allows us to
obtain E(r) easily and also perform simple Bayes modifications when
some sample values of r are observed.
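
The Normal-Normal modification of section 7.2.4 can be sketched in the same way as the Gamma case. In the illustration below the posterior on \mu is written as a new (v, w) pair so that the prior formulas of section 7.2.2 apply unchanged; the function names and the numbers are assumptions chosen only for the example.

```python
def update_normal_normal(v, w, rewards):
    """Posterior (v, w) for the mean after observing independent rewards.

    The prior mean is Normal(v, w*sigma2); with k observations summing
    to T_k the posterior is Normal((v + w*T_k)/(k*w + 1),
    (w/(k*w + 1))*sigma2). Hypothetical helper name.
    """
    k = len(rewards)
    total = sum(rewards)
    v_post = (v + w * total) / (k * w + 1)
    w_post = w / (k * w + 1)
    return v_post, w_post


def marginal_mean_and_variance(v, w, sigma2):
    """Marginal of r is Normal with mean v and variance (w + 1)*sigma2."""
    return v, (w + 1) * sigma2


# Made-up example: sigma2 = 0.2, prior v = 2, w = 0.5, two observations.
sigma2 = 0.2
v_post, w_post = update_normal_normal(2.0, 0.5, [3.0, 2.0])
post_mean, post_var = marginal_mean_and_variance(v_post, w_post, sigma2)

# Updating one observation at a time gives the same posterior.
v_seq, w_seq = update_normal_normal(*update_normal_normal(2.0, 0.5, [3.0]), [2.0])
```

Note that the posterior w shrinks as k grows, so the marginal variance of r approaches the known \sigma^2 as more rewards are observed.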


7.3. The Range of "r" is Finite

In most cases a finite range on r can be adequately approximated
by either an Exponential-Gamma or Normal-Normal framework
with suitably chosen parameter values. Another possible method of
handling this situation would be the following:

Suppose the allowable range of r is from A to B. Then, we
could say that

    f_{r|m,n}(r_o | m_o, n_o) = \frac{(r_o - A)^{m_o-1} (B - r_o)^{n_o-1}}{(B - A)^{m_o+n_o-1} \beta(m_o, n_o)},   A \le r_o \le B

(The Beta distribution with arbitrary limits)

where m and/or n would, in turn, be random variables having convenient
distributions. Unfortunately, it appears that there are no such
convenient distributions. Hence, it seems that we must resort to one
of the aforementioned approximation methods when the range of r is
finite.


CHAPTER  8 


DETERMINATION  OF  THE  EXPECTED  REWARDS  IN 
VARIOUS  TIME  PERIODS 


As  was  explained  in  Chapter  5  the  expected  rewards  in 
various  time  periods  are  of  central  importance  in  decision  making  in 
Markov  processes.  Hence,  it  is  imperative  that  we  be  able  to  evaluate 
these  expected  rewards. 

8.1.  The  Expected  Reward  per  Period  in  the  Steady  State 

8.1.1. Arbitrary Distribution on r_{ij}

Consider an N state Markov process with exactly known transition
matrix, P = (p_{ij}), and let r_{ij} be the reward per transition from
state i to state j (i, j = 1, 2, ..., N). The r_{ij}'s are random
variables rather than being exactly known.

When all the r_{ij}'s are exactly known, the expected reward per
transition or per period (we assume throughout that exactly one
transition occurs per period) in the steady state is

    R_t = \sum_{i=1}^{N} \pi_i \sum_{j=1}^{N} p_{ij} r_{ij}

where, again, \pi_i is the steady state probability of being in state i.
However, the r_{ij}'s are random variables; still, we can write

    E(R_t) = E\left[\sum_{i=1}^{N} \pi_i \sum_{j=1}^{N} p_{ij} r_{ij}\right] = \sum_{i=1}^{N} \pi_i \sum_{j=1}^{N} p_{ij} E(r_{ij})        (8.1)

Two different mechanisms will produce this result:

i) As in earlier situations we select r_{ij} values from their
distributions and let the process run to the steady state, obtaining
an R_t. This is repeated a large number of times and the long term
average of R_t is as given in equation (8.1).

ii) We let the process run to the steady state just once and
every time a transition occurs from state i to state j we draw an
r_{ij} value from its (unchanged) distribution. The long term expected
reward per transition is again given by equation (8.1).

If, instead of rewards for transitions, we used rewards for
being in states, equation (8.1) would be replaced by

    E(R) = \sum_{i=1}^{N} \pi_i E(r_i)

where E(R) is the expected reward per period in the steady state
and r_i is the reward per period for being in state i (i = 1, 2, ..., N).

8.1.2. Exponential-Gamma Distribution on r_{ij}

    f_{r_{ij}|\lambda_{ij}}(r_o | \lambda_o) = \lambda_o e^{-\lambda_o r_o},   0 \le r_o

and                                                                  (8.2)

    f_{\lambda_{ij}}(\lambda_o) = f_\gamma(\lambda_o | v_{ij}, w_{ij}) = \frac{w_{ij}^{v_{ij}} \lambda_o^{v_{ij}-1} e^{-w_{ij}\lambda_o}}{\Gamma(v_{ij})},   0 \le \lambda_o

Then, as shown in section S.2 of Appendix S,

    f_{r_{ij}}(r_o) = \frac{v_{ij} w_{ij}^{v_{ij}}}{(r_o + w_{ij})^{v_{ij}+1}},   0 \le r_o

and

    E(r_{ij}) = \frac{w_{ij}}{v_{ij} - 1},   v_{ij} \ge 2

Therefore, substituting in equation (8.1),

    E(R_t) = \sum_{i=1}^{N} \pi_i \sum_{j=1}^{N} p_{ij} \frac{w_{ij}}{v_{ij} - 1}        (8.3)

which is the expected reward per transition in the steady state when
the rewards have Exponential-Gamma prior distributions with parameters
v_{ij} and w_{ij}. Unlike in Section I where the p_{ij}'s were not
exactly known, here we have an easily computable expression for
the E(R_t) of an N state process.

8.1.3. Normal-Normal Distribution on r_{ij}

    f_{r_{ij}|\mu_{ij}}(r_o | \mu_o) = f_N(r_o | \mu_o, \sigma_{ij}^2) = \frac{1}{\sqrt{2\pi}\,\sigma_{ij}} e^{-(r_o - \mu_o)^2/(2\sigma_{ij}^2)},   -\infty < r_o < \infty

and

    f_{\mu_{ij}}(\mu_o) = f_N(\mu_o | v_{ij}, w_{ij}\sigma_{ij}^2)

Then, as shown in section T.2 of Appendix T,

    f_{r_{ij}}(r_o) = f_N(r_o | v_{ij}, (w_{ij}+1)\sigma_{ij}^2)

and

    E(r_{ij}) = v_{ij}

Therefore, substitution in equation (8.1) yields

    E(R_t) = \sum_{i=1}^{N} \pi_i \sum_{j=1}^{N} p_{ij} v_{ij}        (8.4)

which is the expected reward per transition in the steady state when the
rewards have Normal-Normal prior distributions with parameters v_{ij}
and w_{ij}. Again, E(R_t) is an easily computable quantity.

8.1.4. The Effects on E(R_t) of Sample Values of the Rewards

Suppose we observe several transitions with the associated
rewards.

Let f_{ij} = the number of observed transitions from state i to
state j (i, j = 1, 2, ..., N), F = (f_{ij}), and s_{ij} = the total
reward from the f_{ij} transitions; S = (s_{ij}).

i) For the Exponential-Gamma framework with prior parameters
v_{ij} and w_{ij} we have seen in section 7.1.4 that

    E(r_{ij} | f_{ij}, s_{ij}) = \frac{w_{ij} + s_{ij}}{v_{ij} + f_{ij} - 1}

Therefore,

    E(R_t | F, S) = \sum_{i=1}^{N} \pi_i \sum_{j=1}^{N} p_{ij} \frac{w_{ij} + s_{ij}}{v_{ij} + f_{ij} - 1}        (8.5)

ii) For the Normal-Normal framework with prior parameters
v_{ij} and w_{ij}, section 7.2.4 showed that

    E(r_{ij} | f_{ij}, s_{ij}) = \frac{v_{ij} + w_{ij} s_{ij}}{f_{ij} w_{ij} + 1}

Therefore,

    E(R_t | F, S) = \sum_{i=1}^{N} \pi_i \sum_{j=1}^{N} p_{ij} \frac{v_{ij} + w_{ij} s_{ij}}{f_{ij} w_{ij} + 1}        (8.6)

Hence, both frameworks allow us to easily calculate E(R_t)
after several observations have been made, a very desirable property
for statistical decision purposes.
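
Equations (8.1), (8.5) and (8.6) translate directly into a short computation. The sketch below is illustrative only: the steady state vector is obtained by repeated multiplication by P (an assumed numerical shortcut, adequate for an ergodic chain), and the 2-state matrices P, V, W, F and S are made-up values, not data from the text.

```python
def steady_state(P, sweeps=500):
    """Approximate the steady state vector by iterating pi <- pi P."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(sweeps):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def exp_gamma_mean(v, w, f=0, s=0.0):
    """E(r | f, s) for the Exponential-Gamma framework, as in (8.5)."""
    return (w + s) / (v + f - 1)


def normal_normal_mean(v, w, f=0, s=0.0):
    """E(r | f, s) for the Normal-Normal framework, as in (8.6)."""
    return (v + w * s) / (f * w + 1)


def expected_reward_per_transition(P, pi, mean_reward):
    """Equation (8.1): sum_i pi_i sum_j p_ij E(r_ij)."""
    n = len(P)
    return sum(pi[i] * sum(P[i][j] * mean_reward(i, j) for j in range(n))
               for i in range(n))


# Made-up 2-state example with Exponential-Gamma rewards.
P = [[0.5, 0.5], [0.25, 0.75]]
V = [[2, 3], [3, 2]]
W = [[10.0, 20.0], [30.0, 40.0]]
F = [[0, 0], [0, 2]]              # two observed 2-to-2 transitions
S = [[0.0, 0.0], [0.0, 50.0]]     # with total reward 50

pi = steady_state(P)
prior_value = expected_reward_per_transition(
    P, pi, lambda i, j: exp_gamma_mean(V[i][j], W[i][j]))
posterior_value = expected_reward_per_transition(
    P, pi, lambda i, j: exp_gamma_mean(V[i][j], W[i][j], F[i][j], S[i][j]))
```

Only the (2, 2) element changes between the prior and posterior values here, since that is the only transition with observations.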


8.2.  The  Expected  Rewards  in  Transient  Situations 

We shall restrict attention to the situation where the rewards
are given for being in states. For exactly known rewards let R_i(n) be
the expected reward in n periods given that we start in state i. Then,

    R_i(n) = \sum_{j=1}^{N} t_{ij}(n) r_j

where r_j is the reward per period for being in state j (j = 1, 2, ..., N)
and t_{ij}(n) is the expected number of times that the process will be in
state j during the next n periods given that the present state is i.
The quantity t_{ij}(n) can be obtained by transform techniques as shown
in Howard's Dynamic Probabilistic Systems.[21]

Since the r_j's are random variables, R_i(n) is now a random
variable whose expected value is

    E[R_i(n)] = E\left[\sum_{j=1}^{N} t_{ij}(n) r_j\right]

    E[R_i(n)] = \sum_{j=1}^{N} t_{ij}(n) E(r_j)        (8.7)

where the quantities are defined above. This formulation assumes the
following mechanism. For each j we select an r_j and run the process
(starting from state i) for n periods. If this is repeated a large
number of times (selecting new r_j's each time) the average reward for
the n periods approaches E[R_i(n)]. Note that the r_j distributions are
not updated as we progress through the n periods. Furthermore, it is
interesting that we use here the same E(r_j) quantities that were
required under steady state conditions in section 8.1.

[21] Op. cit.

Numerical  Example 


Mike, the waiter, who is in financial trouble, is concerned about
the amount of money he can expect to make on tips from the next n
customers that he serves. Mike is mathematically inclined but rather
moody. He figures that all customers can be split into two groups, good
and bad. In fact, Mike reasons that a good customer will leave a tip
that is approximately normally distributed with variance 0.2, and a
mean that is, in turn, normally distributed with mean 2 and variance 0.1.
The corresponding quantities for a bad customer are 0.4, 1 and 0.3.

Mike's moodiness is reflected by the fact that his transitions from one
customer type to another can be represented by the following
probabilities.

               G      B
    P =   G [ 3/4    1/4 ]
          B [ 1/3    2/3 ]

That is, he is more likely to stay with the current type than switch. (A
good tip improves his morale, making his attitude better. This, in turn,
improves the chance of another good tip.)

Given that Mike is now serving a good customer, what are his
expected total tips in the next n customers (including the present one)?
The reward structure is given by

    f_{r_1|\mu_1}(r_o | \mu_o) = f_N(r_o | \mu_o, 0.2)
    f_{\mu_1}(\mu_o) = f_N(\mu_o | 2, 0.1)        \therefore v_1 = 2, w_1 = 1/2

    f_{r_2|\mu_2}(r_o | \mu_o) = f_N(r_o | \mu_o, 0.4)
    f_{\mu_2}(\mu_o) = f_N(\mu_o | 1, 0.3)        \therefore v_2 = 1, w_2 = 3/4

The following flow graph immediately gives us the geometric transforms
of t_{11}(n) and t_{12}(n), the expected numbers of good and bad
customers, respectively, in the next n customers.

[Flow graph with nodes A, B and C not reproduced.]

The transmission from A to B is

    t_{AB}(z) = t_{11}^{T}(z) = \sum_{n=0}^{\infty} t_{11}(n) z^n

    \therefore t_{11}^{T}(z) = \frac{z - \frac{2}{3} z^2}{(1 - z)^2 (1 - \frac{5}{12} z)}

Using a partial fraction expansion and inverting the transforms, we
obtain

    t_{11}(n) = \frac{4}{7} n + \frac{36}{49} - \frac{15}{49} \left(\frac{5}{12}\right)^{n-1},   n \ge 0

Similarly, using the transmission from A to C, there results

    t_{12}(n) = \frac{3}{7} n - \frac{36}{49} + \frac{15}{49} \left(\frac{5}{12}\right)^{n-1},   n \ge 0

Now, from section 7.2.2,

    E(r_1) = E(\mu_1) = v_1 = 2

and

    E(r_2) = E(\mu_2) = v_2 = 1

Then using equation (8.7),

    E[R_1(n)] = \sum_{j=1}^{2} t_{1j}(n) E(r_j)

              = 2\left[\frac{4}{7} n + \frac{36}{49} - \frac{15}{49}\left(\frac{5}{12}\right)^{n-1}\right] + \left[\frac{3}{7} n - \frac{36}{49} + \frac{15}{49}\left(\frac{5}{12}\right)^{n-1}\right]

              = \frac{11}{7} n + \frac{36}{49} - \frac{15}{49} \left(\frac{5}{12}\right)^{n-1},   n \ge 0

This is his expected total tip money from the next n customers.
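
The closed form for Mike's expected tips can be checked without transforms by stepping the state probabilities forward one customer at a time and accumulating the expected number of visits to each state. The sketch below does this in exact rational arithmetic; it is an independent check rather than the flow graph method of the text, and the function names are assumptions.

```python
from fractions import Fraction as Fr

# Transition probabilities between customer types (state 0 = good, 1 = bad).
P = [[Fr(3, 4), Fr(1, 4)],
     [Fr(1, 3), Fr(2, 3)]]
MEAN_TIP = [Fr(2), Fr(1)]        # E(r_1) = v_1 = 2, E(r_2) = v_2 = 1


def expected_visits(P, start, n):
    """t_{start,j}(n): expected periods in each state j over the next n
    periods, counting the present period as the first."""
    size = len(P)
    dist = [Fr(0)] * size
    dist[start] = Fr(1)
    visits = [Fr(0)] * size
    for _ in range(n):
        visits = [a + b for a, b in zip(visits, dist)]
        dist = [sum(dist[i] * P[i][j] for i in range(size))
                for j in range(size)]
    return visits


def expected_total_tips(n):
    """Equation (8.7) evaluated by direct summation, starting in 'good'."""
    visits = expected_visits(P, 0, n)
    return sum(t * m for t, m in zip(visits, MEAN_TIP))


def closed_form_total_tips(n):
    """(11/7) n + 36/49 - (15/49) (5/12)^(n-1), as derived in the text."""
    return Fr(11, 7) * n + Fr(36, 49) - Fr(15, 49) * Fr(5, 12) ** (n - 1)
```

For example, `expected_total_tips(1)` is 2, the expected tip from the present good customer alone, and the two functions agree for every n tried, including n = 0.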


CHAPTER 9


MARKOV  DECISION  PROBLEMS  WHEN  THE  TRANSITION 
PROBABILITIES  ARE  KNOWN  EXACTLY  BUT  THE 
REWARDS  ARE  RANDOM  VARIABLES 


As was done in Chapter 5, we shall make the realistic assumption
that, given a choice, the decision maker will use a Markov process
only if the expected net revenue is positive. Hence, in the steady state
situation the process will be utilized only if the expected reward per
transition, E(R_t), is greater than the fixed cost per period for using
the process, c. In the transient situation the process will be operated
only if the expected reward in the n remaining periods, E[R_i(n)], is
greater than the cost of operation, nc. This does not imply that all
Markov decisions are made in this manner; rather it is hoped that the
reader will use the analysis presented here as a guide for analyzing a
situation where the decision mechanism is different.

9.1.  Problems  Based  on  Steady  State  Conditions 

Once more it is assumed that the process will be used for only s
periods in the steady state or with discounting for an indefinitely large
number of periods. Only the first situation will be analyzed; the
second involves a trivial extension as mentioned in section 5.2.3. Within
this framework there is a possibility that observations of the process
may be worthwhile.

9.1.1.  Preposterior  Analysis  for  a  Single  Observation 

If E(R_t) is the expected reward per period in the steady state,
then because of the assumed decision mechanism, the expected net
revenue is

    E(N.R.) = max[0, s[E(R_t) - c]]        (9.1)

Suppose that the process is in state k and we decide to pay for
an observation of the next transition reward. With probability p_{km}
(m = 1, 2, ..., N) we shall observe a draw from the density function
f_{r_{km}}(r_o) of the reward r_{km}. The probability that the reward
will be between x and x + dx given that the transition is to state m is
clearly a function of the prior framework placed on r_{km}; it is given
by f_{r_{km}}(x) dx. For the Exponential-Gamma framework with
parameters v_{km} and w_{km} it was shown in section 7.1.2 that

    f_{r_{km}}(x) dx = \frac{v_{km} w_{km}^{v_{km}}}{(x + w_{km})^{v_{km}+1}} dx,   0 \le x        (9.2)

For the Normal-Normal framework with parameters v_{km}, w_{km} and
\sigma_{km}^2, section 7.2.2 revealed that

    f_{r_{km}}(x) dx = f_N(x | v_{km}, (w_{km}+1)\sigma_{km}^2) dx        (9.3)


Now, when a draw falling between x and x + dx occurs from
the f_{r_{km}} distribution, only the k-m element of E(R_t) is changed
and, again, the change, \Delta_{km}(x), is a function of which
framework is being used. Recall from equation (8.1) that

    E(R_t) = \sum_{i=1}^{N} \pi_i \sum_{j=1}^{N} p_{ij} E(r_{ij})

For the Exponential-Gamma framework with parameters v_{ij} and w_{ij},
prior to the observation

    E(r_{km}) = \frac{w_{km}}{v_{km} - 1}        (using equation (8.3))

after the observation,

    E(r_{km} | x) = \frac{w_{km} + x}{v_{km}}        (using equation (8.5))

    \therefore \Delta_{km}(x) = \pi_k p_{km} \left[\frac{w_{km} + x}{v_{km}} - \frac{w_{km}}{v_{km} - 1}\right]        (9.4)

For the Normal-Normal framework with parameters v_{ij}, w_{ij} and
\sigma_{ij}^2, prior to the observation

    E(r_{km}) = v_{km}        (using equation (8.4))

after the observation,

    E(r_{km} | x) = \frac{v_{km} + w_{km} x}{w_{km} + 1}        (using equation (8.6))

    \therefore \Delta_{km}(x) = \pi_k p_{km} \left[\frac{v_{km} + w_{km} x}{w_{km} + 1} - v_{km}\right]        (9.5)

The change, \Delta_{km}(x), in E(R_t) may also cause a change in the
expected net revenue. According to equation (9.1), prior to the
observation,

    E(N.R.) = max[0, s[E(R_t) - c]]

and after it

    E(N.R. | observation of x from r_{km}) = max[0, s[E(R_t) + \Delta_{km}(x) - c]]

Finally, the expected net revenue prior to the observation,
given that it will be taken, is the integral of the expected net revenue
given each possible outcome, weighted by the probability of that
outcome, minus the cost of the observation. That is,

    E(N.R. | observation) = -d_k + \sum_{m=1}^{N} p_{km} \int_x f_{r_{km}}(x) max[0, s[E(R_t) + \Delta_{km}(x) - c]] dx

where d_k = the cost of observing the reward for one transition from
state k.

If E(N.R. | observation) is greater than E(N.R.), we buy the
observation; if not, we do not buy it.

Numerical Example

Joe, a famous local vendor, has been offered the opportunity of
setting up a concession stand in Boston Gardens for 5 consecutive games
late in the 1963-64 schedule of the Boston Bruins. Joe is somewhat
concerned about any venture connected with the Bruins; hence, he has
called in a local operations analyst for help. Joe knows for sure that
his fixed costs per game will be $161. After considerable deliberation,
Joe is convinced that his revenues during a particular contest will be a
function of the Bruins' showing in both the previous game and the
present game. He is satisfied with separating the Bruins' behavior into
"WIN", "TIE" or "LOSE." Also, from past records and forecasts of the
coming season Joe and the analyst are willing to assume the transition
probabilities exactly known at

               W      T      L
          W [ 0.1    0.3    0.6 ]
    P =   T [ 0.2    0.6    0.2 ]
          L [ 0.3    0.3    0.4 ]

The corresponding steady state probability vector is

              W      T      L
    \pi = [ .214   .429   .357 ]

Joe  isn't  so  confident  about  his  revenues  for  various  Bruins' 
showings.  Again,  after  considerable  thought  he  feels  that  the  revenues 
have  Exponential -Gamma  frameworks.  That  is, 


X  )  =  X  e 
o  o 


-X  r 
o  o 


0  <  r 

o 


and 


w..) 

ij 


His  estimates  of  the  parameters  are 


$$V = (v_{ij}) = \begin{bmatrix} 2 & 2 & 2 \\ 3 & 4 & 5 \\ 2 & 4 & 6 \end{bmatrix}$$

and

$$W = (w_{ij}) = \begin{bmatrix} \$300 & \$300 & \$300 \\ \$400 & \$400 & \$500 \\ \$100 & \$300 & \$300 \end{bmatrix}$$

From equation (7.4)

$$[E(r_{ij})] = \left[\frac{w_{ij}}{v_{ij} - 1}\right] = \begin{bmatrix} \$300 & \$300 & \$300 \\ \$200 & \$400/3 & \$125 \\ \$100 & \$100 & \$60 \end{bmatrix}$$

This says, for example, that his expected revenue in a game that the
Bruins win given that they tied the previous game is $200 (the 2-1
element).

Joe will accept the offer only if his expected net revenue is
positive.

Now, Joe has a friend who has run a similar concession stand
at Boston Gardens. Unfortunately, the friend has revenue data for only
a single contest and, without delving into his papers (for a fee), he is
only willing to tell Joe that the previous game ended in a tie. For a
fee of d_2 dollars he will tell Joe his revenue and the outcome of the
corresponding game. Without his friend's revenue information, from
equation (8.1) we have

$$E(R_t) = 0.214[(0.1)300 + (0.3)300 + (0.6)300] + 0.429[(0.2)200 + (0.6)(400/3) + (0.2)125] + 0.357[(0.3)100 + (0.3)100 + (0.4)60] = \$156.40$$

$$E(R_t) - c = -\$4.60 < 0$$

Therefore, without his friend's information Joe would not accept the
offer and his expected net revenue would be zero.
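As a check on the arithmetic, the computation of E(R_t) can be sketched
as follows. The variable names are illustrative assumptions; the steady
state probabilities are the rounded values quoted in the text, so the
result agrees with $156.40 only to within rounding.

```python
# Prior parameters of the Exponential-Gamma framework, as given in the text.
V = [[2, 2, 2], [3, 4, 5], [2, 4, 6]]
W = [[300, 300, 300], [400, 400, 500], [100, 300, 300]]
P = [[0.1, 0.3, 0.6], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
PI = [0.214, 0.429, 0.357]  # steady state vector, rounded as in the text
FIXED_COST = 161.0          # c, dollars per game

# Expected reward of each transition: E(r_ij) = w_ij / (v_ij - 1).
expected_r = [[W[i][j] / (V[i][j] - 1) for j in range(3)] for i in range(3)]

# Steady state expected reward per game: sum_i pi_i sum_j p_ij E(r_ij).
expected_R = sum(
    PI[i] * sum(P[i][j] * expected_r[i][j] for j in range(3))
    for i in range(3)
)

print(round(expected_R, 2))               # 156.39 (quoted as 156.40 in the text)
print(round(expected_R - FIXED_COST, 2))  # -4.61
```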

Now suppose that the friend's help is accepted. As the transition
is from state 2 (a tie), only the second row of (E(r_ij)) will be
affected. Let us treat the 3 possible transitions separately.

Transition from 2 to 1 (With Probability p_21 = 0.2)


Let the observed revenue be x. Then, using equation (9.4), the
change in E(R_t) is given by

$$(0.429)(0.2)\left[\frac{400 + x}{3} - 200\right]$$

or

$$0.0286x - 5.72$$

From above, E(R_t) must increase by at least 4.60 to make the offer
worthwhile.

Therefore, 0.0286x - 5.72 must be ≥ 4.60. That is,
x ≥ $361.

Hence when x ≥ $361, the Gardens' offer becomes worthwhile and

$$E(\text{N.R.}) = 5E(R_t \mid 1, x) = 5[\text{Old } E(R_t) + \Delta_{21}(x) - c]$$

$$= 5[156.40 + 0.0286x - 5.72 - 161.00]$$

$$= 5[-10.32 + 0.0286x]$$
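The break-even revenue can also be sketched in code. The sketch assumes,
as the change in E(R_t) computed above suggests, that one observed
revenue x changes the Gamma parameters of the observed transition from
(v, w) to (v + 1, w + x), so that its expected reward moves from
w/(v - 1) to (w + x)/v. The function name `revenue_threshold` and its
arguments are hypothetical.

```python
def revenue_threshold(pi_k, p_km, v, w, shortfall):
    """Smallest observed revenue x for which E(R_t) rises by `shortfall`.

    pi_k      steady state probability of the starting state k
    p_km      probability of the observed transition k -> m
    v, w      prior Gamma parameters of that transition
    shortfall amount by which the old E(R_t) falls short of the cost c
    """
    slope = pi_k * p_km / v                           # coefficient of x
    intercept = pi_k * p_km * (w / v - w / (v - 1))   # change when x = 0
    return (shortfall - intercept) / slope


# Transition from 2 to 1 in Joe's example: v = 3, w = 400, shortfall = 4.60.
x_21 = revenue_threshold(0.429, 0.2, 3, 400, 4.60)
print(round(x_21))  # 361
```

Calling the same function with (0.429, 0.6, 4, 400, 4.60) and
(0.429, 0.2, 5, 500, 4.60) gives break-even revenues close to $205 and
$393 for the other two transitions.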
$$E(\text{N.R.} \mid \text{transition from 2 to 1 observed}) = 5\int_{361}^{\infty} E(R_t \mid 1, x)\, f_{r_{21}}(x)\, dx = 5\int_{361}^{\infty} [-10.32 + 0.0286x]\, \frac{3(400)^3}{(x + 400)^4}\, dx$$

(using equation (7.3))

$$= \$7.93$$

by straightforward integration.

Similarly,

E(N.R. | transition from 2 to 2 observed) = $12.42

and

E(N.R. | transition from 2 to 3 observed) = $1.05


Now,

$$E(\text{N.R.} \mid \text{transition from 2 observed}) = \sum_{j=1}^{3} p_{2j}\, E(\text{N.R.} \mid \text{transition from 2 to } j) - d_2$$

$$= 0.2(7.93) + 0.6(12.42) + 0.2(1.05) - d_2$$

$$= \$(9.25 - d_2)$$

As we found that the E(N.R.) without an observation was zero, Joe would
pay for his friend's information if and only if

d_2 < $9.25
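The preposterior value of the observation can be reproduced with a short
sketch. It assumes the marginal revenue density used in the integral
above, f(x) = v w^v / (x + w)^(v + 1), for which
E[max(0, x - a)] = w^v / ((v - 1)(a + w)^(v - 1)) when a ≥ 0. The
function name and arguments are hypothetical, and small differences from
the figures quoted in the text presumably reflect rounding in the
original computation.

```python
def value_given_transition(pi_k, p_km, v, w, shortfall, periods):
    """Expected net revenue given that the reward of k -> m is observed."""
    slope = pi_k * p_km / v
    intercept = pi_k * p_km * (w / v - w / (v - 1))
    a = (shortfall - intercept) / slope       # break-even revenue, a >= 0
    # Net revenue is periods * slope * (x - a) when x >= a, zero otherwise.
    expected_excess = w ** v / ((v - 1) * (a + w) ** (v - 1))
    return periods * slope * expected_excess


# (p_2j, v_2j, w_2j) for the three transitions out of state 2 (a tie).
row_2 = [(0.2, 3, 400), (0.6, 4, 400), (0.2, 5, 500)]
values = [value_given_transition(0.429, p, v, w, 4.60, 5) for p, v, w in row_2]
total = sum(p * val for (p, _, _), val in zip(row_2, values))

print([round(val, 1) for val in values])  # close to 7.93, 12.42 and 1.05
print(round(total, 1))                    # close to 9.25, before subtracting d_2
```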
Furthermore the analysis tells us that Joe would accept Boston's offer
only under the following outcomes of the single revenue observed:

i) If the transition is from 2 to 1 (tie to win) and the revenue
is greater than $361,

ii) If the transition is from 2 to 2 (tie to tie) and the revenue
is greater than $205,

or

iii) If the transition is from 2 to 3 (tie to lose) and the revenue
is greater than $393.

9.1.2. Preposterior Analysis for More Than
One Observation


The method in principle is the same as that used for the case
of one observation discussed in the previous section. To find the
expected net revenue given that the observations will be taken, we
integrate the expected net revenue after each possible outcome weighted
by the probability of that outcome. There is no problem in evaluating the
net revenue after any particular experimental outcome. Equations (8.5)
and (8.6) allow us to find the expected reward per period in the steady
state after any experimental outcome for the Exponential-Gamma and
Normal-Normal prior frameworks respectively. Then, we use

$$E(\text{N.R.}) = \max[0,\; s[E(R_t) - c]]$$

to obtain the corresponding expected net revenue. The difficulty is that
the probability of any particular experimental outcome is now an involved
function. This makes the integration difficult. More will be said on this
point in the next section.

Again, if the expected net revenue given that we shall buy the
observations is higher than the expected net revenue without them, we
buy the observations; if lower, we do not make the purchase.


9.1.3.  Some  Other  Remarks  of  Interest 


The continuous nature of the random variables causes difficulties
when a sequential form of analysis is attempted. This can be
illustrated by a situation where we can observe the rewards of one or
two consecutive transitions with the option of not observing the second
one after we have seen the first. For just one transition remaining, the
problem can be analyzed as in the previous section. However, the
continuous nature of the reward for the first transition forces us to
consider an infinite number of possible situations when just the second
transition remains. This would be tractable if the expected net revenue
using an observation was a simple analytic function of the prior
parameters. However, such is clearly not the case.

A possible method of avoiding the above difficulty would be to
approximate each continuous reward distribution by a discrete (quantized)
probability mass function; that is,

$$pr(r = x_k) = p_r(x_k) = p_k, \qquad k = 1, 2, \ldots, n$$

But, in order to allow Bayes modification, we would not assume that
the p_k's were known exactly; rather, we would place a multidimensional
Beta distribution on them as was done with the multinomial distribution
in section 3.1. Under this framework, each observation would
have a finite number of outcomes and sequential analysis would thus be
possible.
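The quantized framework can be illustrated with a small sketch. The
revenue levels and prior parameters below are hypothetical numbers chosen
only for illustration, and the function names are not from the original
study. The point is that a single observation now has exactly n possible
outcomes, each of which leads to a simple parameter update.

```python
def expected_reward(levels, m):
    """Expected reward when pr(r = x_k) = p_k and E(p_k) = m_k / sum(m)."""
    total = sum(m)
    return sum(x * mk / total for x, mk in zip(levels, m))


def update(m, k):
    """Bayes modification after the k-th quantized level is observed once."""
    return [mk + (1 if i == k else 0) for i, mk in enumerate(m)]


levels = [100.0, 200.0, 400.0]  # hypothetical quantized revenue levels x_k
m = [2.0, 1.0, 1.0]             # hypothetical prior parameters

prior_mean = expected_reward(levels, m)
# One posterior expected reward for each of the n = 3 possible outcomes.
posterior_means = [expected_reward(levels, update(m, k)) for k in range(3)]
# Outcome k has predictive probability m_k / sum(m).
preposterior_mean = sum(
    (mk / sum(m)) * pm for mk, pm in zip(m, posterior_means)
)

print(prior_mean, posterior_means, preposterior_mean)
```

Because the set of outcomes is finite, the same enumeration can be
repeated for a second observation, which is what makes a sequential
analysis feasible.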

In the single observation case discussed in section 9.1.1 we
could obtain the expected value of perfect information about one or more
of the unknown parameters by following the method outlined in section
5.4.1. Therefore, there is no need to elaborate here on the value of
perfect information.




9.2. Statistical Decisions in Transient Situations


In  Chapter  5  it  was  mentioned  that  in  the  case  of  statistical 
decision  problems  when  the  transition  probabilities  were  not  exactly 
known  transient  situations  could  be  handled  in  essentially  the  same 
way  as  the  decision  problems  based  on  steady  state  conditions.  The 
exact  same  statement  can  be  made  for  the  present  situation  where  the 
rewards  are  not  exactly  known. 

The quantity s[E(R_t) - c] is replaced by E[R_i(n)] - nc, the
expected net revenue in the next n periods given that the process will
be used. Naturally, if we delay 1 period (perhaps to observe a reward)
before using the process and the system moves to state j, the expected
net revenue will then be E[R_j(n-1)] - (n-1)c. With these minor
modifications we are able to analyze the single observation case exactly
as in section 9.1.1. Also, we encounter the same predicament for a
sequential decision framework as in the steady state situation. Finally,
the expected value of perfect information about one or more of the
unknown parameters could be evaluated by the method of section 5.4.1.
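A transient calculation of this kind can be sketched as follows. The
sketch assumes the standard recursion for Markov processes with rewards,
v(n) = q + P v(n - 1) with v(0) = 0, where q_i is the expected reward of
one transition out of state i; the expressions of Chapter 8 are not
reproduced here, so the function name and the recursion are assumptions.
The numbers are those of Joe's example.

```python
def expected_total_reward(P, q, n):
    """Expected reward over the next n periods for every starting state."""
    size = len(P)
    v = [0.0] * size
    for _ in range(n):
        v = [q[i] + sum(P[i][j] * v[j] for j in range(size)) for i in range(size)]
    return v


P = [[0.1, 0.3, 0.6], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]
# q_i = sum_j p_ij E(r_ij), using the expected revenues of the example.
q = [300.0, 145.0, 84.0]
c = 161.0

v5 = expected_total_reward(P, q, 5)
net_from_tie = v5[1] - 5 * c  # E[R_i(n)] - nc, starting from state 2 (a tie)
print(net_from_tie)
```

For large n the increase per period approaches the steady state expected
reward, which ties this calculation back to the steady state analysis.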
CHAPTER 10


CONCLUSIONS 


10.1  Summary 

The  purpose  of  this  study  has  been  to  extend  the  range  of  application 
of  Markov  process  theory  by  removing  one  of  the  fundamental  assumptions 
made  in  earlier  theoretical  considerations  of  such  processes,  namely  that 
both the rewards and transition probabilities be known exactly.

In  Section  I  we  assumed  that  the  rewards  were  exactly  known  but  the 
transition  probabilities  (  or  matrices  or  rates  )  were  random  variables. 

The first approach involved the concept of a multi-matrix Markov
process; this process is governed by one of several known matrices but
we only know probabilistically which matrix is being used. The analysis of
this situation is quite straightforward, as was shown in Chapter 2. We
described the Bayes modification (after some transitions are observed) of
the probabilities that the various matrices are being used. The
determination of various quantities, such as mean recurrence times, was
illustrated.

Then various cost structures were placed on multi-matrix processes.
Finally,  the  possibility  of  using  statistical  decision  theory  in  multi-matrix 
situations  was  revealed  by  consideration  of  a  2-state,  2-matrix  example 
where  there  is  an  option  of  buying  observations  of  transitions  before  stating 
which  matrix  is  being  used. 

Chapters 3, 4, and 5 were concerned with the physically more
appealing situation where the transition probabilities themselves (rather
than the matrices) are considered as random variables. It was first
demonstrated that it is convenient (for Bayes calculations and thus for
statistical decision purposes) to have a multidimensional Beta prior
distribution on the probabilities of each row of the transition matrix.
This is a direct extension of the previously known result of using a Beta
prior on the probability of success in a Binomial process. Chapter 3 was
concerned with developing many basic properties of the multidimensional
Beta distribution, several of which were to be utilized later in the
study.

In Chapter 4 we attacked the problem of determining various quantities
of interest when the transition probabilities are multidimensional Beta
distributed. It is important, both for statistical decision purposes and
for interest in the quantities per se, that we be able to describe their
behavior when the transition probabilities are not known exactly.
Unfortunately, analytic complexities prevented a detailed theoretical
treatment of the situation. The one important analytic achievement for
the N state case was the derivation of the probability mass functions of
the state occupancy times. Analytic results were also obtained for mean
recurrence times, multi-step transition probabilities and expected steady
state probabilities in 2-state processes. Finally, a simple trapping
states situation was analyzed.

It was illustrated that the expected values of the steady state
probabilities are very important for many decision purposes. Hence,
considerable time was devoted to those quantities. The analytic
expression for a special 2-state situation was made possible through the
use of the hypergeometric function. Due to analytic complexities,
simulation was required for processes with more than 2 states. However,
the results were most encouraging in that they suggested that a
reasonable approximation to the expected values of the steady state
probabilities can be obtained by assuming that the transition
probabilities are exactly known at their mean values and then using the
corresponding, easily calculable, steady state probabilities. This
permits us to have the multidimensional Beta priors on the transition
probabilities, a situation ideal for Bayes modifications, yet we can
still easily obtain a reasonable approximation to the expected steady
state probabilities, the latter being essential for statistical decision
purposes.

In Chapter 5 we analyzed several statistical decision situations for
Markov processes whose transition probabilities are multidimensional Beta
distributed. Again, it should be emphasized that no claim is made that the
situations considered cover most of those that could occur in practice.
Rather, the intent was to provide the reader with an approach to solving
certain types of decision problems arising in Markov processes in the hope
that he would be able to make the suitable adjustments to fit the
particular physical situation confronting him. Items such as the expected
value of perfect information about a transition probability were also
considered. The entire statistical decision framework established for
Markov processes was summarized in Figure 5.6.

Chapter 6 involved a brief look at continuous time processes to show
that  essentially  the  same  situation  (  including  analytic  difficulties  )  exists  as 
in  the  discrete  time  case. 

In  Section  II  we  assumed  that  the  transition  probabilities  were  exactly 
known  but  now  the  rewards  were  random  variables.  This  is  the  exact  opposite 
situation  from  that  of  Section  I. 

Chapter 7 illustrated two convenient (in the Bayes sense) prior
distributions to place over the rewards. Then Chapter 8 was concerned
with the determination of the expected rewards in various time periods
(both steady state and transient). These quantities were much easier to
obtain than were the corresponding items in Section I where the
transition probabilities were not known exactly. Chapter 9 illustrated
how to use these expected rewards in making statistical decisions
concerning Markov processes where the rewards are not known exactly,
e.g., is it worthwhile to observe some sample rewards to improve our
knowledge about expected future rewards before deciding whether or not to
utilize a Markov process?

10.2 Some General Remarks


There are several remarks that do not logically fit into the above
"summary" section or into Section 10.3.

As  in  all  situations  where  a  Bayes  approach  is  to  be  utilized  the 
analyst  must  avoid  the  pitfall  of  developing  his  prior  probability  distributions 
after  observing  the  data  and  then  obtaining  an  a  posteriori  distribution  through 
the  use  of  the  same  data.  Therefore,  in  using  a  Bayes  approach  to  the  study 
of  Markov  processes  the  prior  probability  distributions  over  the  transition 
probabilities  or  rewards  must  first  be  developed,  then  the  observational  data 
is  used  to  obtain  the  posterior  distributions  through  the  use  of  Bayes'  rule. 

In  many  sections  of  this  study  the  suggested  formulas  or  techniques 
involve  considerable  computation.  Fortunately,  as  in  other  areas  of  applied 
mathematics,  the  use  of  high  speed  digital  computers  makes  these  methods 
practical  whereas  hand  computations  would  be  out  of  the  question. 

Again  it  should  be  stressed  that  the  intent  of  this  study  was  not  to 
solve  all  practical  Markov  problems  where  either  the  transition  probabilities 
or rewards are not exactly known; rather the goal was to develop
mathematical expressions for some quantities of interest under this situation and
to  indicate  by  analyzing  some  specific  examples  that  statistical  decisions 
related  to  Markov  processes  can  and  should  be  made. 

10.3 Suggested Related Areas for Further Research

It  is  apparent  that  a  fundamental  study  of  this  nature  would  suggest 
several  related  research  topics  of  varying  difficulty. 

In two situations in this research the lack of an ideal prior
distribution restricted our progress. One restriction was of a relatively
minor nature, namely in Chapter 7 where we could not obtain a convenient
prior that would exactly fit the physical constraint of a reward having
to lie within a finite range. However, the second restriction was of a
far more serious nature. The multidimensional Beta prior used for the
probabilities of a single row of the Markov transition matrix was ideal
for Bayes calculations but was restrictive due to the fact that it
provided only k prior parameters (for a k state process). Ideally we
would like 2k - 1 parameters so that the prior estimates of the k - 1
marginal means and the k marginal variances of the individual
probabilities could be satisfied. The determination of a prior
distribution having 2k - 1 parameters yet still allowing simple Bayes
modification would be a significant extension to this study.

As  mentioned  in  the  main  text  an  interesting  related  problem  is 
Howard's  policy-iteration  situation  but  with  transition  probabilities  and/or 
rewards  no  longer  exactly  known. 

In  many  Markov  process  applications  probabilities  in  different  rows 
of  the  transition  matrix  are  not  independent  as  assumed  throughout  Chapters 
3  to  5.  An  extreme  situation  would  be  a  process  where  we  know  that  two  rows 
of  the  matrix  are  identical  but  do  not  know  the  exact  values  of  the  elements. 
Situations  like  this  would  require  more  complicated  prior  distributions  than 
the  multidimensional  Betas  used  in  this  study. 

Recently, significant research has been done in the area of
semi-Markov processes, processes where the transitions are governed by a
regular P matrix but the transition times have arbitrary probability
distributions. A natural extension of this thesis would be to consider
semi-Markov processes where the P matrix and/or the probability
distributions of the transition times were not exactly known.

This study has not encompassed the situation where both the transition
probabilities and the rewards are not exactly known. An analysis of this
more general problem would certainly be of value.




As mentioned in Chapter 2, the multi-matrix framework is really
a part of the combination of Markov process theory and game theory.
Unfortunately, the complexities of game theory have made the
determination of analytic results extremely difficult; hence, the use of
game theory in Markov process situations would not appear imminent.

As was stated in the main text, an important decision problem
not considered in this report is the situation where, once we have
decided to use a process, we can change our strategy (i.e., make further
decisions) as we observe the process in operation during the time in
which we are using it. In effect, this situation can be considered as a
type of adaptive control problem. We are not committed by a single
decision but can adjust our actions as we become more familiar with the
process.

Undoubtedly, the reader will be able to make additions to the
above list of suggested related areas for additional research. In any
event it is hoped that this study will stimulate further fundamental
investigations in the theory of Markov processes.
APPENDIX A


PROOF THAT v_n(a', i) IS A PIECEWISE LINEAR
FUNCTION OF THE COMPONENTS OF a'

which is seen to be a piecewise linear function of the components of a'.

Now, v(a', i) is either "l" or "s", but we have already shown that "s"
is piecewise linear in the components of a'. Therefore, by assuming
that v_{n-1}(a', i) is piecewise linear in the components of a', we have
shown that v_n(a', i) is piecewise linear in the components of a'. Also,
we have shown that v(a', i) is piecewise linear in the components of a'.
Hence, the proof that v_n(a', i) will always be a piecewise linear function
of the components of a' has been completed by induction.
APPENDIX B


IMPORTANT  PROPERTIES  OF  THE  MULTIDIMENSIONAL 
BETA  DISTRIBUTION 
B.2. The Marginal Distribution of p_j and Its Moments

By the same approach as that used in section B.1 (except that all

B.3. Joint Distributions and Covariances


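Proceeding as above we obtain

$$f_{p_j, p_u}(x_j, x_u) = \frac{1}{\beta\left(m_j, m_u, \sum_{i \ne j,u} m_i\right)}\, x_j^{m_j - 1}\, x_u^{m_u - 1}\, (1 - x_j - x_u)^{\sum_{i \ne j,u} m_i - 1}$$

with $x_j + x_u \le 1$, $x_j \ge 0$, $x_u \ge 0$,

i.e.,

$$f_{p_j, p_u} = f_\beta\left(x_j, x_u \mid m_j, m_u, \sum_{i \ne j,u} m_i\right)$$

and

$$\operatorname{cov}(p_j, p_u) = \frac{-m_j m_u}{M^2 (M + 1)}, \qquad j \ne u$$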
APPENDIX C


DETERMINATION  OF  THE  PRIOR  PARAMETERS  OF  A 


MULTIDIMENSIONAL  BETA  DISTRIBUTION 


BY  LEAST  SQUARES 


$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = \frac{1}{\beta(m_1, m_2, \ldots, m_k)}\, x_1^{m_1 - 1}\, x_2^{m_2 - 1} \cdots x_k^{m_k - 1}$$

where

$$\sum_{i=1}^{k} x_i = 1$$

As stated in section 3.3, we wish to select the parameters (m_j's)
such that the prior mean values of the p_j's, i.e., the E(p_j)'s, are
satisfied exactly. This uses up k-1 of the parameters, leaving only 1
other degree of freedom. The final parameter is to be obtained by a least
squares fit to the prior variances of the p_j's, i.e., to the
$\overset{v}{p}_j$'s.

Define

$$M = \sum_{i=1}^{k} m_i$$

From equations (3.6) and (3.7),

$$E(p_j) = \frac{m_j}{M} \qquad \text{(C.1)}$$

and

$$\overset{v}{p}_j = \frac{m_j (M - m_j)}{M^2 (M + 1)} \qquad \text{(C.2)}$$

Equation (C.1) gives

$$m_j = M E(p_j)$$

Substituting in equation (C.2),

$$\overset{v}{p}_j = \frac{M E(p_j)[M - M E(p_j)]}{M^2 (M + 1)} = \frac{E(p_j)[1 - E(p_j)]}{M + 1}$$

Let $\tilde{\overset{v}{p}}_j$ be the prior estimated value of the variance of
p_j and let $\overset{v}{p}_j$ be the value of the variance of p_j obtained
when M is used.
J  J 

Then, the problem is to select M so as to minimize

$$D = \sum_{j=1}^{k} \left(\tilde{\overset{v}{p}}_j - \overset{v}{p}_j\right)^2 = \sum_{j=1}^{k} \left(\tilde{\overset{v}{p}}_j - \frac{E(p_j)[1 - E(p_j)]}{M + 1}\right)^2$$

We set dD/dM = 0 (the necessary condition for a minimum) and solve the
resulting equation for the least-squares value of M, denoted by
M_{L.S.}. This quantity is given by

$$M_{L.S.} = \frac{\sum_{j=1}^{k} [E(p_j)]^2 [1 - E(p_j)]^2}{\sum_{j=1}^{k} \tilde{\overset{v}{p}}_j\, E(p_j)[1 - E(p_j)]} - 1 \qquad \text{(C.3)}$$

and from equation (C.1)

$$m_j = M_{L.S.}\, E(p_j) \qquad \text{(C.4)}$$
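The least squares fit can be sketched in a few lines. The sketch uses the
value of M obtained by setting dD/dM = 0, namely the ratio of
sum_j {E(p_j)[1 - E(p_j)]}^2 to sum_j (estimated variance of p_j) times
E(p_j)[1 - E(p_j)], less one. The function name and the sample inputs are
hypothetical.

```python
def least_squares_parameters(means, variances):
    """Return (M, [m_1, ..., m_k]) fitted to prior means and variances.

    means      prior mean values E(p_j), which must sum to 1
    variances  prior estimates of the variances of the p_j's
    """
    a = [e * (1.0 - e) for e in means]  # E(p_j)[1 - E(p_j)]
    numerator = sum(aj * aj for aj in a)
    denominator = sum(vj * aj for vj, aj in zip(variances, a))
    M = numerator / denominator - 1.0
    return M, [M * e for e in means]


# Hypothetical prior estimates that are exactly consistent with M = 9.
means = [0.2, 0.3, 0.5]
variances = [0.016, 0.021, 0.025]
M_ls, m = least_squares_parameters(means, variances)
print(M_ls, m)
```

When the estimated variances are not mutually consistent, the fit
satisfies the means exactly and the variances only in the least squares
sense.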
APPENDIX D


BAYES  MODIFICATION  OF  THE  MULTIDIMENSIONAL 
BETA  DISTRIBUTION 


As in section 3.1, consider a multinomial distribution of
order k with parameters p_1, p_2, ..., p_k. Suppose the p_j's are a
priori jointly distributed according to the multidimensional Beta
distribution

$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k).$$

Let E be the event that in n independent draws from the multinomial n_i
fall in the i-th category (i = 1, 2, ..., k).

$$pr(E \mid x_1, x_2, \ldots, x_k) = \frac{n!}{n_1!\, n_2! \cdots n_k!}\, x_1^{n_1} x_2^{n_2} \cdots x_k^{n_k}$$

Using Bayes' rule

$$f_{p_1, p_2, \ldots, p_k \mid E}(x_1, x_2, \ldots, x_k) = \frac{pr(E \mid x_1, x_2, \ldots, x_k)\, f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k)}{pr(E)}$$
where some terms independent of the x_i's have been cancelled from the
numerator and denominator.

Proceeding exactly as in Appendix B this reduces to

$$f_{p_1, p_2, \ldots, p_k \mid E}(x_1, x_2, \ldots, x_k) = \frac{1}{\beta(m_1 + n_1, m_2 + n_2, \ldots, m_k + n_k)}\, x_1^{m_1 + n_1 - 1}\, x_2^{m_2 + n_2 - 1} \cdots x_k^{m_k + n_k - 1}$$

$$= f_\beta(x_1, x_2, \ldots, x_k \mid m_1 + n_1, m_2 + n_2, \ldots, m_k + n_k)$$

This is seen to be a multidimensional Beta distribution with modified
parameters.

NOTE:


The denominator of equation (D.1) was (before the above-mentioned
cancellation of terms)

$$pr(E) = \frac{n!}{n_1!\, n_2! \cdots n_k!}\, \frac{\beta(m_1 + n_1, m_2 + n_2, \ldots, m_k + n_k)}{\beta(m_1, m_2, \ldots, m_k)}$$

Let E* be the event E with the added stipulation that we know the order
of the occurrences. Then, clearly

$$pr(E^*) = \frac{\beta(m_1 + n_1, m_2 + n_2, \ldots, m_k + n_k)}{\beta(m_1, m_2, \ldots, m_k)}$$
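The Bayes modification and the probability of an ordered sequence of
observations can be sketched as follows. The sketch assumes that the
multidimensional Beta function is the usual normalizing constant, the
product of the Gamma(m_i) divided by Gamma(m_1 + ... + m_k); the function
names are hypothetical.

```python
import math


def log_beta(m):
    """Logarithm of the multidimensional Beta function of the parameters m."""
    return sum(math.lgamma(x) for x in m) - math.lgamma(sum(m))


def posterior_parameters(m, counts):
    """Modified parameters m_i + n_i after n_i observations in category i."""
    return [mi + ni for mi, ni in zip(m, counts)]


def prob_ordered_event(m, counts):
    """pr(E*): probability of one particular ordering of the observations."""
    return math.exp(log_beta(posterior_parameters(m, counts)) - log_beta(m))


m = [1.0, 1.0, 1.0]  # hypothetical prior parameters
counts = [1, 1, 0]   # one observation in each of the first two categories
print(posterior_parameters(m, counts))  # [2.0, 2.0, 1.0]
print(prob_ordered_event(m, counts))    # close to 1/12
```

Working with logarithms of Gamma functions avoids overflow when the
parameters or the counts are large.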
APPENDIX E


A METHOD FOR SAMPLING FROM THE MULTIDIMENSIONAL
BETA DISTRIBUTION


Consider the variables p_1, p_2, ..., p_k having the multidimensional
Beta distribution

$$f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k) = f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k) \qquad \text{(E.1)}$$

As shown in equation (3.5) the marginal distribution of p_1 is given by

$$f_{p_1}(x_1) = f_\beta\left(x_1 \mid m_1, \sum_{i \ne 1} m_i\right)$$

The conditional distribution of p_2, p_3, ..., p_k given p_1 is defined by

$$f_{p_2, p_3, \ldots, p_k \mid p_1}(x_2, x_3, \ldots, x_k \mid x_1) = \frac{f_{p_1, p_2, \ldots, p_k}(x_1, x_2, \ldots, x_k)}{f_{p_1}(x_1)}$$

$$= \frac{\dfrac{1}{\beta(m_1, m_2, \ldots, m_k)}\, x_1^{m_1 - 1}\, x_2^{m_2 - 1} \cdots x_k^{m_k - 1}}{\dfrac{1}{\beta\left(m_1, \sum_{i \ne 1} m_i\right)}\, x_1^{m_1 - 1}\, (1 - x_1)^{\sum_{i \ne 1} m_i - 1}}$$
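This simplifies to

$$f_{p_2, p_3, \ldots, p_k \mid p_1}(x_2, x_3, \ldots, x_k \mid x_1) = \frac{1}{\beta(m_2, m_3, \ldots, m_k)}\, \frac{1}{(1 - x_1)^{k-1}} \left(\frac{x_2}{1 - x_1}\right)^{m_2 - 1} \left(\frac{x_3}{1 - x_1}\right)^{m_3 - 1} \cdots \left(\frac{x_k}{1 - x_1}\right)^{m_k - 1}$$

where

$$\sum_{i=2}^{k} x_i = 1 - x_1$$

and

$$x_i \ge 0$$

Making the substitutions

$$r_j = \frac{p_j}{1 - p_1}, \qquad \text{also} \quad y_j = \frac{x_j}{1 - x_1}, \qquad j = 2, 3, \ldots, k$$

and noting that the Jacobian

$$\frac{\partial(p_2, p_3, \ldots, p_k)}{\partial(r_2, r_3, \ldots, r_k)} = (1 - p_1)^{k-1}$$

we obtain

$$f_{r_2, r_3, \ldots, r_k \mid p_1}(y_2, y_3, \ldots, y_k \mid x_1) = \frac{1}{\beta(m_2, m_3, \ldots, m_k)}\, y_2^{m_2 - 1}\, y_3^{m_3 - 1} \cdots y_k^{m_k - 1}$$

$$= f_\beta(y_2, y_3, \ldots, y_k \mid m_2, m_3, \ldots, m_k)$$

which is a multidimensional Beta distribution of one lower dimension.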

Now we know that the marginal distribution of r_2 will be the
simple Beta

$$f_{r_2 \mid p_1}(y_2 \mid x_1) = f_\beta\left(y_2 \mid m_2, \sum_{i=3}^{k} m_i\right)$$

and letting

$$s_j = \frac{r_j}{1 - r_2}, \qquad j = 3, 4, \ldots, k$$

we obtain the conditional distribution

$$f_{s_3, s_4, \ldots, s_k \mid p_1, r_2}(z_3, z_4, \ldots, z_k \mid x_1, y_2) = f_\beta(z_3, z_4, \ldots, z_k \mid m_3, m_4, \ldots, m_k).$$

Continuing in this way the conditional distribution will,
eventually, be reduced to a simple Beta. This suggests the following
method of sampling from the k-dimensional Beta distribution,
$f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k)$:


i) Draw $w_1$ from the simple Beta $f_\beta\left(w_1 \mid m_1, \sum_{i=2}^{k} m_i\right)$

ii) Draw $w_2$ from the simple Beta $f_\beta\left(w_2 \mid m_2, \sum_{i=3}^{k} m_i\right)$

iii) ...

k-1) Draw $w_{k-1}$ from the simple Beta $f_\beta(w_{k-1} \mid m_{k-1}, m_k)$

Then

$$x_1 = w_1$$

$$x_2 = w_2 (1 - w_1)$$

$$x_3 = w_3 (1 - w_2)(1 - w_1)$$

$$\vdots$$

$$x_{k-1} = w_{k-1} (1 - w_{k-2}) \cdots (1 - w_2)(1 - w_1)$$

and

$$x_k = 1 - \sum_{i=1}^{k-1} x_i$$

As shown in equation (E.1), x_1, x_2, ..., x_k are sample values of
p_1, p_2, ..., p_k, respectively.
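The sampling method can be sketched with the standard library's simple
Beta generator. The function name is hypothetical, and the optional `rng`
argument (any object with a `betavariate` method) is an assumption added
so that a seeded generator can be supplied.

```python
import random


def sample_multidimensional_beta(m, rng=random):
    """Draw one sample (x_1, ..., x_k) by successive simple Beta draws."""
    k = len(m)
    x = []
    remaining = 1.0  # running product (1 - w_1)(1 - w_2)... of earlier draws
    for i in range(k - 1):
        # Draw w_i from the simple Beta with parameters m_i and the sum
        # of the parameters that follow it.
        w = rng.betavariate(m[i], sum(m[i + 1:]))
        x.append(w * remaining)
        remaining *= 1.0 - w
    x.append(remaining)  # x_k = 1 - (x_1 + ... + x_{k-1})
    return x


rng = random.Random(0)
m = [2.0, 3.0, 5.0]  # hypothetical parameters
samples = [sample_multidimensional_beta(m, rng) for _ in range(20000)]
means = [sum(s[j] for s in samples) / len(samples) for j in range(len(m))]
print(means)  # close to m_j / M = [0.2, 0.3, 0.5]
```

Only k - 1 simple Beta draws are needed for each sample, because the last
component is fixed by the requirement that the components sum to one.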
APPENDIX F


THE TRANSIENT SOLUTIONS FOR 3-STATE,
DISCRETE TIME, MARKOV PROCESSES


This appendix presents a portion of the results of another study$^{23}$
performed by the author.

Consider  a  3-state,  discrete  time,  Markov  process  having  the 
transition  matrix 


Define $\phi_{ij}(n)$ to be equal to the probability that the system will
be in state j at time n given that it is in state i at time zero. Also,
let


and 


23. Silver, E. A., The Transient Solutions for 3-State, Discrete Time,
Markov Processes, Technical Note 1, M.I.T. Operations Research
Center, 1963.
Then the following flow chart and equations give $\phi_{ij}(n)$ for any
combination of the parameters a, b, c, d, e and f.


The Relevant Equations

$$\phi_{ij}(n) = \frac{1}{3} - \frac{1}{3}(1 - 3a)^n, \qquad i \ne j, \quad n \ge 0 \qquad \text{(F.1)}$$

$$\phi_{ii}(n) = \frac{1}{3} + \frac{2}{3}(1 - 3a)^n$$


Convention: 


We shall denote a transition probability by $p_{\alpha\beta}$ = pr(state
is $\beta$ at time n + 1 | state was $\alpha$ at time n). Then, if we are
looking for $\phi_{ij}(n)$, $i \ne j$, we denote the third state by k; if
we are looking for $\phi_{ii}(n)$, we arbitrarily denote the other two
states by j and k.

For 

by  noting  that 


For  the  matrix  P^,  we  can  make  the  notation  even  simpler


and 


Pij  Pik  Pi* 


Pji  Pjk  Pj* 


Pki  "  Pkj  Pk* 
FLOW  CHART  OF  METHOD  FOR  OBTAINING  A


4>  (n)  = 
ij 


Pi*  Pk* 


where 


n  >  0 

i  *  j 


(F.2) 


k  ”  Pj*  Pj*  *  Pi*  Pk^  +  Pj*  Pk* 


„  I*. 2  T 

=  2  V  Pi*  +  Pj*  *  Pk*  -  L 


(F.3) 


Q  =  3px*  Pk* 


and 


s»  =  '>Pl,'p, 


♦  („, .  .  J. 

ii'  1  L  6L 


>  0 


(F.  41 


where  L  p  and  v  are  as  in  equation  (F.3) 


w  -3Pi*<Pj*+Pk*) 


and 


( F .  5) 


x  =  6Pi*(Pj*+Pk*‘Pi*Pj*'Pi*Pk*) 
where


“  -  hj  pki  *  hj  % +  pikPkj 


7  =  P.  •  P.,  +  P  ■  P,  •  +  P-  •  P,  •  +  P.,  P-. 

ij  jk  rij  ki  ij  kj  ik  ji 


+  P  ,  P-,  +  P-,  P,  •  +  P-.  P,  • 

ik  jk  ik  kj  ji  ki 


+  P-.  P,  •  +  P.,  P,  • 
Ji  kj  jk  ki 


M  =  p  -  4q 


(F.  6) 


(F.7) 


P  =  P  -  2 
q  =  1  -  p  +  7 

p  =a+b+c+d+e+f 
and  y  is  as  defined  in  equation  (F.6) 

n  >  0 

i  *  j 

where 

v  =  iu 


4>..(n)  =  y- 
ij  7 


_1_ 

27 


(F.8) 
(F.9) 
(F. 10) 


(F.  11) 


M  is  defined  in  equation  (F.  7) 


2  2  2 

P  =  2p. .  p  +  p. .  p  .  +  p..  p  +  2p  .  p  p.. 

ij  jk  rij  rki  rij  kj  rij  ik  ji 


+  2p.  p  p  +  p  .  p..  p,  .  +  p..  p..  p,  . 

ij  ik  jk  ij  ji  ki  ij  ji  kj 


+  hj  pjk  pkl 
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(F.12) 


y  and  a 
tion  (F. 

4>.  •  (n)  = 
11 

where 


*  Pij  Pik  Pki  '  Pij  Pjk  Pkj  *  2Pij  Pk!  Pkj 


*  Pij  Pki  -  Pij  Pkj  *  Pik  Pji  Pkj  *  Pik  Pjk  Pkj 


-  Pik  Pki  Pkj  '  Pik  Pkj  -  Pik  Pkj 


are  as  defined  in  equation  (F.6)  and  p  is  as  defined  in  equa- 


3). 

6_  J_ 

y  ~  Zy 


CFf  KI'fffHI 


n>0  (F.  1  3) 


6  =  P-.  P,  .  +  P.,  P,  .  +  P..  P,  ■ 

Ji  ki  jk  ki  ji  kj 


w  =  6  -  y 


*  -  pik  pkj '  hk  pji  •  pik  pjk  -  pij  pki '  pij  % 

-  pfj  pjk 4  pij  pki 4  p,k  4 4  hj  % 

2  2  ^ 

+  p  p  +  p  p  +  P-  •  P  i  -  P  ,  P. .  P,  . 

ik  ji  ik  jk  Jk  ik  ij  *ki 


2pik  hj  pkj '  2p,k  pij  pjk 


(F.ll) 


•pik  Pki  pji  •  P,k  Pkl  Pjk  •  P,k  pij  Pji 


-  Pij  Pk,  Pji  '  Pij  Pkj  Pji  4  Pik  pki  Pkj 
4  2pij  Pki  Pkj  4  2pik  Pkj  Pjk  4  2P>J  Pkj  Pjk 


+  2p  p..  p..  +  p  .  p..  p.. 

ik  rjj  jk  rij  *ji  jk 
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and  y  p  and  v  are  as  defined  in  equations  (F.6),  (F.8)  and  (F.12), 
respectively. 


4>  (n) 
ij 


a 

7 


n/2 

^ —  [a  cos  n©  +  (3/y 


sin  n©  ] 


n  >  0 

i  *  j 


(F.  15) 


where


y  =  v  -M 

M  is  defined  in  equation  (F.7)  (F.16) 

6  =  tan  1  ( -y/( -p)) 

and  a  (3 ,  y,  q  and  p  are  defined  in  equations  (F.6),  (F.12),  (F.6), 
(F.9)  and  (F.8),  respectively. 

,  n/2 

<t>..(n)  = - -  [-w  cos  n©  -  x/(y)  sin  n©  ]  n  S  0  (F.  17) 

n  7  7 

where  6,  w  and  x  are  defined  in  equation  (F.14);  y  and  ©  are 
defined  in  equation  (F.  16);  and  y  and  p  are  defined  in  equa-  (F.  18) 
tions  (F.6)  and  (F.8),  respectively. 


1J 


T  n 

2(j.nT 
(P  -2 )p 


a  n 

—  T 

7 


n  5:  0 

i  *  j 


(F.  19) 


where 


P  - 


P.  .(-P.  -P-,  -P  . -P-,  +P,  +p.  .)  + 
rij  rij  rik  rji  jk  ki  kj 


2p,k  pkJ 


2  -  p 

T  = - —  (F .  20) 

and  a  y  and  p  are  defined  in  equations  (F.6)  (F.6),  and  (F.10), 

re  spectively . 




+ 


n  5  0 


(F.  21) 


,  .  ,  _  6  .  nnT  7  -  6  n 

^ii  n  7  +  (p  -2)p  7 


where 


2  2  2  2  2  2 

n=  p. .  +  p..  -  p..  -  p.,  -  p,  .  -  p,  -  +  2(p. .p  +p..p, . 

ij  lk  ji  jk  ki  kj  i.]  ik  ji  ki 


+pjipkjtpjkpki-pjipjk-pjk''kj-i>k?kj» 


(F.  22) 


and  6,7  ,  p  and  t  are  defined  in  equations  (F.  14),  (F.6), 
(F.10),  and  (F.20),  respectively. 


(P. „  (  P-1)  P-2  "  a 

41.  .(n)  =  -  r~  6(n)  +  - - ^  + - ^ 

ij  p-  2  P  -  1 


,n-l 


I -  (2-p)  n  >  0  (F.  23) 

i  *  J 


where  a  and  p  are  defined  in  equations  (F.6)  and  (F.10) 
and  6(k>  '  •  -Pr  k  =  ° 


k)  =f‘  f<> 

(.0  Ot 


(F .  24) 


her  wise 


V") 


(  p-2+p  -6) 

- 6(n)  + 

p  -  2 


(/>•!)  *>u  -  6 
p  -  1 


(2-p  )n_  1  n  >  0 


where  6,  p  ,  and  6 (k)  are  defined  in  equations  (F.  14), 
(F.10)  and  (F.24),  respectively. 


(F.  25) 


(F .  26) 


4\ j(n)  =  (-«)  6(n)  +  (p_-c*)  6(n-l)  +  a 


n  2=  0 

i  *  j 


(F .  27) 


where  a  and  6(k)  are  defined  in  equations  (F.6) 
and  (F.  24). 


(F .  28) 


4>..(n)  =  (1+6)  6(n)  +  (l-p..-p  -6)  6(n-l)  +6  n  ^  0 

11  Jh 


(F.  29) 


where  6  and  6(k)  are  defined  in  equations  (F.  14) 
and  ( F.  24). 
APPENDIX G


THE  USE  OF  THE  HYPERGEOMETRIC  FUNCTION  IN  THE 
DETERMINATION  OF  THE  EXPECTED  VALUES  OF  THE  STEADY 
STATE  PROBABILITIES  IN  A  SPECIAL  2 -STATE  MARKOV 
PROCESS. 


$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

where "a" is assumed exactly known, but "b" has the Beta distribution

$$f_b(x) = f_\beta(x \mid m, n)$$


G.l.  Determination  of  the  Expected  Values  of  the  Steady  State  Proba- 
bilities 


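For a given (a, b) pair,

$$\pi_2 = \frac{a}{a+b},$$

the steady state probability of being in state 2.

$$E(\pi_2) = E\left(\frac{a}{a+b}\right) = \int_0^1 \frac{a}{a+x}\, f_b(x)\, dx
          = \int_0^1 \frac{a}{a+x}\,\frac{1}{\beta(m,n)}\, x^{m-1}(1-x)^{n-1}\, dx$$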
This is not an easy integral to evaluate in its current form.
However, the substitution w = 1 - x leads to

$$E(\pi_2) = \frac{a}{a+1}\,\frac{1}{\beta(m,n)} \int_0^1 w^{n-1}(1-w)^{m-1}\left(1 - \frac{w}{a+1}\right)^{-1} dw \tag{G.1}$$

It is fortunate that this integral has appeared elsewhere, namely in the
solution of differential equations arising in certain physics problems.
More precisely, we have the hypergeometric function, F, defined by²⁴

$$F(a,\, c-b \mid c \mid z) = \frac{\Gamma(c)}{\Gamma(b)\,\Gamma(c-b)} \int_0^1 w^{c-b-1}(1-w)^{b-1}(1-wz)^{-a}\, dw \tag{G.2}$$


F is tabulated only for values of the 4 parameters relevant to the
physics problems for which the function was first conceived.
Unfortunately, those values are not appropriate for the Markov process analysis.

Therefore,  we  must  find  a  convenient  way  of  calculating  F. 

F is expressible as a convergent series²⁵

$$F(p,\, q \mid r \mid z) = 1 + \frac{pq}{r\,1!}\,z + \frac{p(p+1)\,q(q+1)}{r(r+1)\,2!}\,z^2 + \cdots \tag{G.3}$$

(convergent provided |z| < 1).

²⁴ Morse, P. M. and Feshbach, H., Methods of Theoretical Physics,
Part I, McGraw-Hill, 1961, p. 591.

²⁵ Ibid., p. 388.

Using equations (G.1) and (G.2), there results

$$E(\pi_2) = \frac{a}{a+1}\, F\!\left(1,\, n \,\Big|\, m+n \,\Big|\, \frac{1}{a+1}\right)$$

Then utilizing equation (G.3) (valid because 1/(a+1) < 1), we obtain


$$E(\pi_2) = \frac{a}{a+1}\left[\,1 + \frac{n}{m+n}\left(\frac{1}{a+1}\right)
  + \frac{n(n+1)}{(m+n)(m+n+1)}\left(\frac{1}{a+1}\right)^2 + \cdots
  + \frac{n(n+1)\cdots(n+k-2)}{(m+n)(m+n+1)\cdots(m+n+k-2)}\left(\frac{1}{a+1}\right)^{k-1} + R_k \right]$$

(the last term written out is the k-th term, counting the 1 as the first term)

where

$$R_k = \frac{n(n+1)\cdots(n+k-1)}{(m+n)(m+n+1)\cdots(m+n+k-1)}\left(\frac{1}{a+1}\right)^k
  \left[\,1 + \frac{n+k}{m+n+k}\left(\frac{1}{a+1}\right)
  + \frac{(n+k)(n+k+1)}{(m+n+k)(m+n+k+1)}\left(\frac{1}{a+1}\right)^2 + \cdots \right]$$

But

$$\frac{n+j}{m+n+j} < 1$$

Therefore,

$$R_k < \frac{n(n+1)\cdots(n+k-1)}{(m+n)(m+n+1)\cdots(m+n+k-1)}\left(\frac{1}{a+1}\right)^k
  \left[\,1 + \frac{1}{a+1} + \left(\frac{1}{a+1}\right)^2 + \cdots \right]
  = \frac{n(n+1)\cdots(n+k-1)}{(m+n)(m+n+1)\cdots(m+n+k-1)}\left(\frac{1}{a+1}\right)^k \left(\frac{a+1}{a}\right) \tag{G.4}$$

and there follows

$$E(\pi_2) = \frac{a}{a+1}\left[\,1 + \frac{n}{m+n}\left(\frac{1}{a+1}\right)
  + \frac{n(n+1)}{(m+n)(m+n+1)}\left(\frac{1}{a+1}\right)^2 + \cdots
  + \frac{n(n+1)\cdots(n+k-2)}{(m+n)(m+n+1)\cdots(m+n+k-2)}\left(\frac{1}{a+1}\right)^{k-1} \right] + E_k \tag{G.5}$$

where

$$E_k = \frac{a}{a+1}\, R_k < \frac{n(n+1)\cdots(n+k-1)}{(m+n)(m+n+1)\cdots(m+n+k-1)}\left(\frac{1}{a+1}\right)^k \tag{G.6}$$

It is clear that $E_k$ can be made arbitrarily small by choosing
k sufficiently large. Hence, we can come arbitrarily close to $E(\pi_2)$ by
selecting a large enough k.

Finally,

$$E(\pi_1) = 1 - E(\pi_2)$$
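
The series (G.5) and the truncation bound (G.6) translate directly into a short routine. The following is a minimal sketch, not the computer program used for the thesis: the function names, the tolerance `tol` and the midpoint-rule cross-check are all assumptions introduced here for illustration.

```python
import math


def expected_pi2(a, m, n, tol=1e-12, max_terms=1_000_000):
    """Approximate E(pi_2) by the series (G.5), with a known and b ~ Beta(m, n).

    Summation stops once the bound (G.6) on the truncation error drops below tol.
    """
    z = 1.0 / (a + 1.0)
    total = 0.0
    term = 1.0  # series term (n)_k / (m+n)_k * z**k, starting at k = 0
    for k in range(max_terms):
        total += term
        term *= (n + k) / (m + n + k) * z  # advance to the next series term
        if term < tol:  # `term` now equals the right-hand side of (G.6)
            break
    return a / (a + 1.0) * total


def expected_pi2_numeric(a, m, n, steps=200_000):
    """Cross-check: midpoint-rule evaluation of the defining integral E[a / (a + b)]."""
    beta = math.gamma(m) * math.gamma(n) / math.gamma(m + n)
    h = 1.0 / steps
    acc = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        acc += a / (a + x) * x ** (m - 1) * (1.0 - x) ** (n - 1)
    return acc * h / beta


if __name__ == "__main__":
    # a = 0.2, b = m/(m+n) = 0.2, m + n = 10
    print(expected_pi2(0.2, 2, 8), expected_pi2_numeric(0.2, 2, 8))
```

Because every term of the series is positive, the partial sums approach $E(\pi_2)$ from below, and the stopping test uses exactly the quantity bounded in (G.6). The results can be compared with the tabulated values of Appendix H.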


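G.2.  Asymptotic Check on the $E(\pi_2)$ Formula


In equation (G.5) suppose we fix m/(m+n) = b and let m and n
both tend to infinity (this is equivalent to saying that "b" is exactly
known). Then,

$$\lim_{m,n\to\infty} E(\pi_2) = \lim_{m,n\to\infty} \frac{a}{a+1}\left[\,1 + \frac{n}{m+n}\,\frac{1}{a+1}
  + \frac{n}{m+n}\cdot\frac{\frac{n}{m+n} + \frac{1}{m+n}}{1 + \frac{1}{m+n}}\left(\frac{1}{a+1}\right)^2 + \cdots \right]$$

$$= \frac{a}{a+1}\left[\,1 + (1-b)\,\frac{1}{a+1} + (1-b)\,\frac{1-b+0}{1+0}\left(\frac{1}{a+1}\right)^2 + \cdots \right]$$

$$= \frac{a}{a+1}\left[\,1 + \frac{1-b}{a+1} + \left(\frac{1-b}{a+1}\right)^2 + \cdots \right]$$

$$= \frac{a}{a+1}\cdot\frac{1}{1 - \frac{1-b}{a+1}} = \frac{a}{a+1}\cdot\frac{a+1}{a+1-1+b} = \frac{a}{a+b}$$

which is the exact steady state probability when "a" and "b" are known
exactly.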


G.3.  Monotonic Behavior of $E(\pi_2)$ as a Function of m + n for
Fixed E(b)

$$\frac{n+j}{m+n+j} = \frac{\frac{n}{m+n} + \frac{j}{m+n}}{1 + \frac{j}{m+n}}
  = \frac{1 - b + \frac{j}{m+n}}{1 + \frac{j}{m+n}} \qquad (m/(m+n) = E(b) = b)$$

$$\frac{d}{d(m+n)}\left(\frac{n+j}{m+n+j}\right)
  = \frac{\left(1 + \frac{j}{m+n}\right)\left(-\frac{j}{(m+n)^2}\right) - \left(1 - b + \frac{j}{m+n}\right)\left(-\frac{j}{(m+n)^2}\right)}{\left(1 + \frac{j}{m+n}\right)^2}
  = \frac{-bj}{(m+n+j)^2} < 0 \quad \text{for } j > 0$$

Therefore (n+j)/(m+n+j) decreases monotonically as m+n increases.
But we observe from equation (G.5) that each term in $E(\pi_2)$ is a
multiple of factors of the form (n+j)/(m+n+j) and terms that don't
depend on m + n. Hence, for fixed E(b), $E(\pi_2)$ monotonically decreases
as m + n increases.
APPENDIX H


PORTIONS OF THE $E(\pi_2)$ VALUES OBTAINED THROUGH
THE USE OF THE HYPERGEOMETRIC FUNCTION


Consider the 2-state Markov process with

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

where "a" is exactly known but $f_b(x) = f_\beta(x \mid m, n)$. The following tables
give values of $E(\pi_2)$ accurate to 5 significant figures for the various
combinations of a, b = m/(m + n), and m + n. These are only portions of the
results obtained using a computer program and equations (4.1) and (4.2).
(Presentation of the entire results would have required a prohibitive amount
of typing.)


a = 0.2

 m+n \ b      0.2       0.4       0.6       0.8

    10      .54432    .35584    .25988    .20341
    30      .51606    .34080    .25319    .20109
    50      .50979    .33780    .25190    .20065
   100      .50495    .33556    .25094    .20032
   200      .50249    .33445    .25047    .20016
   500      .50100    .33378    .25019    .20006
  1000      .50050    .33355    .25009    .20003
    ∞       .50000    .33333    .25000    .20000


a = 0.4

 m+n \ b      0.2       0.4       0.6       0.8

    10      .69196    .51785    .40962    .33714
    30      .67601    .50616    .40321    .33458
    50      .67240    .50372    .40192    .33408
   100      .66958    .50187    .40096    .33370
   200      .66813    .50093    .40048    .33352
   500      .66726    .50037    .40019    .33341
  1000      .66696    .50019    .40009    .33337
    ∞       .66667    .50000    .40000    .33333


a = 0.6

 m+n \ b      0.2       0.4       0.6       0.8

    10      .76598    .61338    .50813    .43208
    30      .75590    .60469    .50276    .42974
    50      .75362    .60284    .50166    .42927
   100      .75184    .60143    .50083    .42892
   200      .75093    .60072    .50041    .42875
   500      .75037    .60029    .50016    .42864
  1000      .75018    .60014    .50008    .42860
    ∞       .75000    .60000    .50000    .42857


a = 0.8

 m+n \ b      0.2       0.4       0.6       0.8

    10      .81095    .67687    .57815    .50308
    30      .80403    .67027    .57373    .50104
    50      .80247    .66885    .57282    .50062
   100      .80126    .66777    .57212    .50031
   200      .80063    .66722    .57178    .50015
   500      .80025    .66689    .57157    .50006
  1000      .80013    .66678    .57150    .50003
    ∞       .80000    .66667    .57143    .50000
APPENDIX I


SUMMATION EXPRESSION FOR E(π) FOR A 2-STATE

PROCESS  WHERE  BOTH  TRANSITION  PROBABILITIES 
ARE  INDEPENDENTLY  BETA  DISTRIBUTED 


$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

$$f_a(x) = f_\beta(x \mid m_1, n_1)$$

$$f_b(y) = f_\beta(y \mid m_2, n_2)$$

where $m_1$, $n_1$, $m_2$ and $n_2$ are positive integers

O  V0  *  T^' 

=  S0‘ X*' 0-x)*r'Jo  T+j-y  n±~ 'O-uV^'dydx  .-.(r.ij 


Substitute z = x + y


',+*  • 


(z-x)  *  (i+x-i)  A  dz 


'•  i  =  y 

-  c+iz  ( v)(-*r  *  -iGl'  ( r;o~A-*ri  -« 


k'O  v  ’  j‘0 

The.  fern  uj  Here  fc  =  and  j~ox-(  wi((  produce  a  logarithm 


Therefore, we must separate it


n»- a 


Ha+ni-a-k->y  J  j-0  v  ^  ' 

.  f  X^l  _  ,  vnW/  .n*-. 

L  nx-i- J  fli-i-j  J  +  (-*)  (it*)  Jjx 


i-K 

< 


Substituting this expression into (I.1) leads to
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«•) £(rj  =  fo  xn‘  (i-x)n'  '  f (\'J(~/)kxl 

-n^^xciitoXL  -  icrio^r 


[d-Kl 


n^-2- k-J 


J  =6 


«.+n.-i-kv  J  L  jso  -  — 

°*-a  /nA-i)/(i  » >>  /  Ox-i-j  7  kx-i  .nx-f  .nx-i  i**2  j, 

-  .&■  (  J  iO*/  (-'j  _* _  +  X  (’t)  (H-x)  iA,ircd* 

J"°  r\-i-j  J  J 

Expanding the (1+x) terms, we obtain

n.-a  nx-i  n^-A-lc  (/  .  „  il/n^vi-t] 

l(n„»,)f(^,«J£W=  .*■  S-  7*,7(-'J  (  J  A.  r/ 

fc*o  g=o  r  V - 


('  *,-hk  +  r  n,-J 

•jtx  ,  (i-xj  dx 

<^i-x  nA~f  j 


t 


fc-o 


u-  o  r<o 


( "r'k-ir^cyH r)  c\"' 

Tn,x+n^-JL- Ic-j  o 


Hx+r\^-i-k-j 


+nx^+r-A-j 

(f-xj  <fc 


U*°  r-o  - : - 71 -  J. 


,n,-i 


r«o 

*1^  A*- A.  J 


0-Y>  '  dx 


fh 


+c-<r,^  cvH-ir^Cr)  n^+'+r-^(i _x)^4k 

\->-j 

/  "»-<)/ n,-<  ),  \t  ft  n(+nx-i4^+t 

^  s  /C  -fc  A  '  J  *  4/ \y(i-*y)dx 


'J~°  Tt=  O 


•f-0  -tro 


The first four terms involve Beta integrals; the last two
are evaluated through the use of the following formulas taken
from integral tables²⁶


²⁶ DeHaan, D. B., Nouvelles Tables d'Intégrales Définies, Hafner, 1957,
Tables 106-107.
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T'jzlll 


J0  y3'0'  dy  =  ia  +  (  iIa -i-  +  la+l  j&-  u 

r  »{a  ( i+^j  ^ 2a>  dij  =  -4 —  ^  i~'i u ' 


■2d 


U=r  |  U 


and  Jo  (JAtj)b(i-y*)  3?  >d3  *  C-Ob  t>\  jk  C 


Fh 


Then 


n4-i  n^-r  /  n  i  )  /  I  r'i+k'<'J/nJ.-/)/ni+oi-i-fc  I 

<€  5=_  <==L  C  ic  /H  fji(  r  ) 


k=o  ./* 


j  -  o  r=  o 


M'a  +  n*  W 


■  £  (n.+  k+rn,  n,  ) 

"w  «*-»  J  /  «... / 1  ✓  .  r*+k  V  «*-«  I /  J 

+  ^  ^ 

*s°  ^=0  rr«  K.a  +nA  -i-k-j 

4(-on-'.r  £  (v-V(-ijn‘-'-J(vj 

J*0  re0 

,  I  H..  ni"J  J 

+  (-')  A  ^  ssl 

a-0  r-o 


Hx-'-J 


-hr,  n,) 


nx-i-J 


nx-<  n,-/ 


f(_l)  ^  ^ 

+(-'r‘£ 

•s~°  7  c=o  (n,+  »ia  +vf+itj A 


ajftece.  J(,  +  -ft) 


K,+  MA+S+"t  u=| 


u)her\  ^ ,+  >n^+s+-fc  ii  ev/en 
f\  ,+**+*+ 1  _  ,  u 


M,+m.j.+s+t 


Xjv  A-  -h 


*i,+nl+.s'f't  u  *  i 


(^) 

K 


(UKdo  T)(  +  (Ax^5+-t 
<5  odd- 
APPENDIX J


DETERMINATION  OF  THE  EXPECTED  VALUES  OF  THE  STEADY 
STATE  PROBABILITIES  FOR  A  SPECIAL  3  -  STATE  PROCESS 


$$P = \begin{bmatrix} 1-a-b & a & b \\ c & 1-c-d & d \\ e & f & 1-e-f \end{bmatrix}$$

where b, c, d, e and f are all exactly known, but $t = \frac{a}{1-b}$ is Beta
distributed, i.e.

$$f_t(x) = f_\beta(x \mid m, n) = \frac{1}{\beta(m,n)}\, x^{m-1}(1-x)^{n-1} \qquad 0 \le x \le 1$$

From section 4.1.1, we know that

$$\pi_1 = \frac{ce + cf + de}{ad + ae + af + bc + bd + bf + ce + cf + de}
       = \frac{\dfrac{ce + cf + de}{1-b}}{(d + e + f)\,\dfrac{a}{1-b} + \dfrac{bc + bd + bf + ce + cf + de}{1-b}}
       = \frac{A}{Bt + C}$$

and

$$\pi_2 = \frac{ae + af + bf}{\text{same denominator}}
       = \frac{(e + f)\,\dfrac{a}{1-b} + \dfrac{bf}{1-b}}{\text{same denominator}}
       = \frac{Dt + G}{Bt + C}$$

where A, B, C, D and G are constants defined by

$$A = \frac{ce + cf + de}{1-b}, \qquad B = d + e + f, \qquad C = \frac{bc + bd + bf + ce + cf + de}{1-b},$$

$$D = e + f \qquad \text{and} \qquad G = \frac{bf}{1-b} \tag{J.1}$$


Fortunately  the  situations  here  are  closely  related  to  those  studied  in 
Appendix  G. 


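$$E(\pi_1) = E\left(\frac{A}{Bt + C}\right) = \frac{A}{B} \int_0^1 \frac{1}{t + \frac{C}{B}}\,\frac{t^{m-1}(1-t)^{n-1}}{\beta(m,n)}\, dt$$

Setting w = 1 - t and simplifying gives

$$E(\pi_1) = \frac{A}{B+C}\,\frac{1}{\beta(m,n)} \int_0^1 w^{n-1}(1-w)^{m-1}\left[1 - w\,\frac{B}{B+C}\right]^{-1} dw$$

Therefore,

$$E(\pi_1) = \frac{A}{B+C}\, F\!\left(1,\, n \,\Big|\, m+n \,\Big|\, \frac{B}{B+C}\right) \qquad \left(\text{convergent because } \frac{B}{B+C} < 1\right)$$

where A, B and C are defined in equation (J.1).

Now

$$E(\pi_2) = E\left(\frac{Dt}{Bt + C}\right) + E\left(\frac{G}{Bt + C}\right)$$

Clearly, the second term presents no difficulty as it is similar in form
to that appearing in $E(\pi_1)$. The other term is treated as follows.

$$E\left(\frac{Dt}{Bt + C}\right) = \frac{D}{B}\,\frac{1}{\beta(m,n)} \int_0^1 \frac{t}{t + \frac{C}{B}}\; t^{m-1}(1-t)^{n-1}\, dt$$

Substituting w = 1 - t and performing some algebraic manipulations leads to

$$E\left(\frac{Dt}{Bt + C}\right) = \frac{D}{B+C}\,\frac{m}{m+n}\,\frac{\Gamma(m+n+1)}{\Gamma(n)\,\Gamma(m+1)} \int_0^1 w^{n-1}(1-w)^{m}\left[1 - \frac{B}{B+C}\, w\right]^{-1} dw$$

Using the definition of F (equation (G.2) of Appendix G),

$$E\left(\frac{Dt}{Bt + C}\right) = \frac{D}{B+C}\left(\frac{m}{m+n}\right) F\!\left(1,\, n \,\Big|\, m+n+1 \,\Big|\, \frac{B}{B+C}\right)$$

Hence,

$$E(\pi_2) = \frac{D}{B+C}\left(\frac{m}{m+n}\right) F\!\left(1,\, n \,\Big|\, m+n+1 \,\Big|\, \frac{B}{B+C}\right)
          + \frac{G}{B+C}\, F\!\left(1,\, n \,\Big|\, m+n \,\Big|\, \frac{B}{B+C}\right)$$

Finally,

$$E(\pi_3) = 1 - E(\pi_1) - E(\pi_2)$$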
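
The three expectations above can be evaluated with the series (G.3) of Appendix G. The sketch below is illustrative only: the function names, the stopping tolerance and the brute-force midpoint integration used as a cross-check are assumptions made here, not part of the original computation.

```python
import math


def hyp_1n(n, c, z, tol=1e-13, max_terms=1_000_000):
    """F(1, n | c | z) from series (G.3) with p = 1: sum over k of (n)_k / (c)_k * z**k."""
    total, term = 0.0, 1.0
    for k in range(max_terms):
        total += term
        term *= (n + k) / (c + k) * z
        if term < tol * (1.0 - z):  # geometric bound on the remaining tail (needs c > n)
            break
    return total


def expected_steady_state(b, c, d, e, f, m, n):
    """E(pi_1), E(pi_2), E(pi_3) when t = a / (1 - b) ~ Beta(m, n)."""
    A = (c * e + c * f + d * e) / (1 - b)
    B = d + e + f
    C = (b * c + b * d + b * f + c * e + c * f + d * e) / (1 - b)
    D = e + f
    G = b * f / (1 - b)
    z = B / (B + C)
    f0 = hyp_1n(n, m + n, z)
    f1 = hyp_1n(n, m + n + 1, z)
    e1 = A / (B + C) * f0
    e2 = D / (B + C) * (m / (m + n)) * f1 + G / (B + C) * f0
    return e1, e2, 1.0 - e1 - e2


def brute_force(b, c, d, e, f, m, n, steps=100_000):
    """Cross-check: integrate pi_1(t) and pi_2(t) against the Beta(m, n) density."""
    beta = math.gamma(m) * math.gamma(n) / math.gamma(m + n)
    const = b * c + b * d + b * f + c * e + c * f + d * e
    h = 1.0 / steps
    s1 = s2 = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        a = t * (1 - b)
        den = a * (d + e + f) + const
        weight = t ** (m - 1) * (1 - t) ** (n - 1) / beta * h
        s1 += (c * e + c * f + d * e) / den * weight
        s2 += (a * e + a * f + b * f) / den * weight
    return s1, s2, 1.0 - s1 - s2


if __name__ == "__main__":
    print(expected_steady_state(0.2, 0.3, 0.4, 0.1, 0.5, 3, 4))
    print(brute_force(0.2, 0.3, 0.4, 0.1, 0.5, 3, 4))
```

The parameter values in the usage lines are arbitrary; they only need to keep every row of P a valid probability vector.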
APPENDIX K


TABLE  OF  SIMULATION  RESULTS  FOR  3 -STATE  STEADY 
STATE  BEHAVIOR  WHEN  THE  TRANSITION  PROBABILITIES 
ARE  MULTIDIMENSIONAL  BETA  DISTRIBUTED 


Expt.  Sample                       s     Within 95%                      s     Within 95%
No.    Size   (π1)ex.    π̄1      × 10³   Conf. Region  (π2)ex.    π̄2      × 10³   Conf. Region  T.P.D.

  1     700    .3368    .3370     4.32        Y          .2211    .2176     3.81        ?          0.7
  2     700    .2528    .2470     2.88        Y          .3199    .3159     3.16        ?          2.0
  3     700    .2988    .2929     2.20        N          .3382    .3342     2.24        ?          2.0
  4     750    .4094    .4101     1.38        Y          .2894    .2873     1.06        N          0.4
  5     700    .3807    .3685     5.49        N          .2749    .2720     3.65        Y          3.0
  6     700    .4361    .4278     3.67        N          .2349    .2387     3.34        Y          1.7
  7     700    .5617    .5652     5.63        Y          .2043    .1900     2.84        N          2.5
  8     700    .3099    .3005     2.90        N          .2949    .2878     2.60        N          3.3
  9     700    .6450    .6572     3.85        N          .1512    .1471     1.99        Y          2.4
 10     700    .4094    .4082     2.04        Y          .2894    .2885     1.65        Y          0.3
 11     700    .4775    .4908     5.75        Y          .3708    .3691     5.63        Y          2.6
 12     700    .1470    .1385     3.00        N          .5245    .5223     5.42        Y          2.1
 13     700    .4266    .4406     4.91        N          .3961    .3907     4.55        Y          2.8
 14     700    .3327    .3307     2.43        ?          .3434    .3410     2.36        Y          0.9
 15     700    .2523    .2512     2.73        ?          .4997    .5090     3.87        N          1.8
 16     700    .3467    .3455     2.35        ?          .3552    .3569     1.93        Y          0.3
 17    1200    .2727    .2980     5.27        N          .5455    .5306     5.90        N          5.1
 18     700    .2527    .2378     3.13        N          .2780    .2772     4.11        Y          3.2
 19     700    .2316    .2288     2.61        ?          .2692    .2624     2.59        N          1.9
 20     700    .3494    .3537     2.00        ?          .3031    .3033     1.84        ?          0.9
 21     600    .1404    .1377     2.24        ?          .1930    .1913     2.57        ?          0.8
 22     800    .3882    .3901     2.69        ?          .2824    .2820     2.70        Y          0.4
 23     800    .3882    .3922     2.09        ?          .2824    .2783     2.09        Y          0.8
 24    1050    .2727    .2848     4.00        N          .5455    .5362     4.55        Y          2.4
 25     700    .2527    .2467     1.85        N          .2780    .2808     2.45        Y          1.2
 26     750    .4094    .4101     1.38        Y          .2894    .2873     1.06        Y          0.4
 27    1000    .2000    .2012     3.77        Y          .4000    .4085     5.01        Y          1.9
 28    1000    .2000    .1955     2.26        N          .4000    .3992     3.02        Y          1.1
 29    1000    .2000    .1996     1.76        ?          .4000    .4011     2.31        Y          0.2
 30    1000    .2857    .2864     1.87        ?          .2987    .3016     2.21        Y          0.7
 31    1000    .3929    .3916     1.97        ?          .3750    .3740     2.05        Y          0.4
 32    1000    .2368    .2377     1.56        ?          .4211    .4209     1.34        Y          0.2
 33    1000    .2148    .2142     1.21        ?          .4003    .3989     1.16        ?          0.4
 34    1000    .3593    .3592     0.95        ?          .3353    .3351     1.01        ?          0.1
 35    1000    .2527    .2506     0.98        N          .3571    .3585     0.87        ?          0.4
 36    1000    .2710    .2691     2.84        Y          .2897    .2907     2.82        ?          0.4
 37    1000    .2963    .2970     3.35        Y          .3333    .3421     4.16        N          1.9
 38    1000    .2353    .2416     3.03        N          .5882    .5858     3.63        Y          1.3

Note:  (π3)ex. and π̄3 can be obtained from π1 + π2 + π3 = 1.

T.P.D. is defined in equation (4.4).

(See Legend for Experiment Numbers on next page.)
Expt. No.   M                                Category

    1       3,  2,  3/ 2,  2,  3/ 2,  1,  3    L, L, L/L, L, L/L, L, L
    2       2, 12,  7/ 2,  1,  3/ 2,  2,  3    L, H, H/L, L, L/L, L, L
    3       3, 12,  8/11,  3,  8/ 2,  3,  3    L, H, H/H, L, H/L, L, L
    4       3,  8,  9/11,  1, 11/11, 11,  3    L, H, H/H, L, H/H, H, L
    5       9,  3,  2/ 2,  2,  3/ 1,  2,  3    H, L, L/L, L, L/L, L, L
    6      11,  8,  8/ 2,  1,  1/ 3,  1,  3    H, H, H/L, L, L/L, L, L
    7       9,  2,  1/11,  2, 11/ 1,  2,  2    H, L, L/H, L, H/L, L, L
    8       7,  9, 10/10,  2,  8/ 1,  2,  2    H, H, H/H, L, H/L, L, L
    9      12,  1,  3/12,  3, 12/12, 12,  2    H, L, L/H, L, H/H, H, L
   10      12, 11,  9/ 9,  3, 11/ 9,  7,  3    H, H, H/H, L, H/H, H, L
   11       7,  1,  2/ 3,  8,  1/ 2,  3,  1    H, L, L/L, H, L/L, L, L
   12      12, 12, 10/ 1,  9,  3/ 1,  2,  3    H, H, H/L, H, L/L, L, L
   13       8,  2,  2/ 2,  8,  3/11, 11,  2    H, L, L/L, H, L/H, H, L
   14      12, 12, 10/10, 11, 11/ 3,  3,  3    H, H, H/H, H, H/L, L, L
   15       9,  8, 10/ 2,  8,  3/ 7,  9,  3    H, H, H/L, H, L/H, H, L
   16      11,  8,  8/ 7,  7, 12/ 7, 10,  2    H, H, H/H, H, H/H, H, L
   17       8,  1,  1/ 1, 18,  1/ 3,  3, 14    H, L, L/L, H, L/L, L, H
   18      11,  8,  9/ 3,  8,  3/ 2,  1,  7    H, H, H/L, H, L/L, L, H
   19       8,  7,  8/ 8,  9, 12/ 2,  3,  8    H, H, H/H, H, H/L, L, H
   20      11,  8, 10/10,  9, 11/10, 10, 10    H, H, H/H, H, H/H, H, H
   21      10, 10, 20/10, 15, 25/ 2,  3, 15    Test Sequence
   22       1,  2,  2/ 4,  1,  5/ 6,  3,  1    Low diagonal, reasonably low elsewhere
   23       2,  4,  4/ 8,  2, 10/12,  6,  2    2 x No. 22
   24      16,  2,  2/ 2, 36,  2/ 6,  6, 28    2 x No. 17
   25      33, 24, 27/ 9, 24,  9/ 6,  3, 21    3 x No. 18
   26      24, 22, 18/18,  6, 22/18, 14,  6    2 x No. 10
   27       6,  2,  2/ 1,  7,  2/ 1,  2,  7    Miscellaneous
   28      18,  6,  6/ 3, 21,  6/ 3,  6, 21
   29      30, 10, 10/ 5, 35, 10/ 5, 10, 35
   30      15,  6,  9/ 8, 24,  8/ 8,  6, 26
   31      15,  9,  6/12, 15,  3/ 6,  9, 15
   32      25, 10, 15/10, 20, 20/ 5, 30, 15
   33      10, 22,  8/10, 15, 25/ 8, 16, 14
   34       3, 15, 12/24,  4, 12/16, 16,  8
   35       4, 20, 16/ 8,  4, 28/20, 25,  5
   36       3,  1,  6/ 2,  3,  5/ 3,  4,  3
   37      14,  2,  4/ 2, 16,  2/ 6,  4, 30
   38      32,  4,  4/ 3, 54,  3/ 6, 12, 42
APPENDIX L

THE  FORM  OF  THE  STEADY  STATE  PROBABILITIES  IN  AN 
N-STATE  MARKOV  PROCESS 


For the simulations of the steady state behavior of 3 and 4
state processes (see sections 4.1.6 and 4.1.7), it was necessary to
write out expressions for the steady state probabilities for the cases
of exactly known transition probabilities. The purpose of this
appendix is to point out some general properties of the form of the steady
state probabilities which became evident from a study of the 3 and 4
state cases.


In general, to obtain the steady state probability vector
$\pi = (\pi_1, \pi_2, \ldots, \pi_N)$ for a given transition matrix P we solve the
set of equations $\pi P = \pi$ and $\sum_{i=1}^{N} \pi_i = 1$.


For 3 states with

$$P = \begin{bmatrix} 1-a_1-a_2 & a_1 & a_2 \\ b_1 & 1-b_1-b_2 & b_2 \\ c_1 & c_2 & 1-c_1-c_2 \end{bmatrix}$$

we obtain

$$\pi_1 = \frac{b_1c_1 + b_1c_2 + b_2c_1}{a_1b_2 + a_1c_1 + a_1c_2 + a_2b_1 + a_2b_2 + a_2c_2 + b_1c_1 + b_1c_2 + b_2c_1}$$

$$\pi_2 = \frac{a_1c_1 + a_1c_2 + a_2c_2}{\text{same denominator}}$$

$$\pi_3 = \frac{a_1b_2 + a_2b_1 + a_2b_2}{\text{same denominator}}$$
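For 4 states with

$$P = \begin{bmatrix} 1-a_1-a_2-a_3 & a_1 & a_2 & a_3 \\ b_1 & 1-b_1-b_2-b_3 & b_2 & b_3 \\ c_1 & c_2 & 1-c_1-c_2-c_3 & c_3 \\ d_1 & d_2 & d_3 & 1-d_1-d_2-d_3 \end{bmatrix}$$

$$\pi_1 = \frac{1}{D}\,\{\, b_1c_1d_1 + b_1c_1d_2 + b_1c_1d_3 + b_1c_2d_1 + b_1c_2d_2 + b_1c_2d_3 + b_1c_3d_1 + b_1c_3d_2$$
$$\qquad +\; b_2c_1d_1 + b_2c_1d_2 + b_2c_1d_3 + b_2c_3d_1 + b_3c_1d_1 + b_3c_1d_3 + b_3c_2d_1 + b_3c_3d_1 \,\}$$

where D = sum of 4 expressions (64 terms in all) similar to the numerator
of $\pi_1$.

(The other 3 expressions are the numerators of $\pi_2$, $\pi_3$ and $\pi_4$.)


Also, for 2 states with

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

$$\pi_1 = \frac{b}{a+b} \qquad \text{and} \qquad \pi_2 = \frac{a}{a+b}$$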


From  the  above  results  for  the  2,  3  and  4  state  systems,  we  can 
now  suggest  the  following  general  statements  about  the  form  of  the  steady 
state  probabilities  of  an  N-state  Markov  process. 


i)  For a given process, every steady state probability will have
the same denominator D, i.e.

$$\pi_j = \frac{N_j}{D} \qquad \text{where} \qquad D = \sum_{j=1}^{N} N_j$$

ii)  For an N-state process, D is the sum of $N^{N-1}$ terms;
these are all the possible (N-1)st order cross-products obtainable
by taking one off-diagonal transition probability from each of N-1
rows of the transition matrix in such a way that no cross-product
involving $p_{ij}\,p_{ji}$ is obtained for any i and j.

iii)  The numerator $N_j$ is made up of all those terms of D
which do not include a factor from the j-th row of the matrix.
$N_j$ will have $N^{N-2}$ terms.

iv)  Any one off-diagonal transition probability occurs in
$N^{N-2}$ terms of D, i.e., in (1/N)th of the terms.
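
As a quick check of the 3-state expressions and of statement i), the closed form can be compared with the defining equations $\pi P = \pi$ and $\sum \pi_i = 1$. The sketch below uses exact rational arithmetic; the function names and the numerical transition probabilities are arbitrary choices made for illustration.

```python
from fractions import Fraction


def steady_state_3(a1, a2, b1, b2, c1, c2):
    """Closed-form steady state for rows (1-a1-a2, a1, a2), (b1, 1-b1-b2, b2), (c1, c2, 1-c1-c2)."""
    n1 = b1 * c1 + b1 * c2 + b2 * c1   # terms with no factor from row 1
    n2 = a1 * c1 + a1 * c2 + a2 * c2   # terms with no factor from row 2
    n3 = a1 * b2 + a2 * b1 + a2 * b2   # terms with no factor from row 3
    d = n1 + n2 + n3                   # common denominator D
    return [n1 / d, n2 / d, n3 / d]


def times_matrix(pi, P):
    """Row vector pi multiplied by the matrix P."""
    size = len(P)
    return [sum(pi[i] * P[i][j] for i in range(size)) for j in range(size)]


# Illustrative transition probabilities (tenths), chosen arbitrarily.
a1, a2, b1, b2, c1, c2 = (Fraction(x, 10) for x in (2, 3, 1, 4, 5, 2))
P = [
    [1 - a1 - a2, a1, a2],
    [b1, 1 - b1 - b2, b2],
    [c1, c2, 1 - c1 - c2],
]
pi = steady_state_3(a1, a2, b1, b2, c1, c2)

if __name__ == "__main__":
    print(pi, times_matrix(pi, P) == pi, sum(pi) == 1)
```

Using `Fraction` avoids rounding, so the equality test is exact rather than approximate.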
APPENDIX M


TABLE OF SIMULATION RESULTS FOR 4-STATE STEADY STATE
BEHAVIOR WHEN THE TRANSITION PROBABILITIES ARE
MULTIDIMENSIONAL BETA DISTRIBUTED.


                                                                        No. (out of 3)
Expt.  Sample                                                           within 95%
No.    Size   (π1)ex.    π̄1    (π2)ex.    π̄2    (π3)ex.    π̄3      Conf. Region    T.P.D.

 39     700    .2601    .2487    .1938    .1876    .2442    .2475         1            3.5
 40     700    .2032    .1873    .2579    .2443    .2787    .2844         0            5.9
 41     700    .2926    .2926    .1688    .1655    .2653    .2728         3            1.5
 42     700    .2654    .2567    .2450    .2358    .2676    .2653         1            4.0
 43    1000    .2654    .2560    .2450    .2350    .2676    .2661         1            4.2
 44     700    .2757    .2722    .2496    .2455    .2183    .2177         1            1.6
 45     700    .2438    .2322    .2463    .2336    .2481    .2475         1            5.0
 46     700    .2889    .2792    .2029    .1823    .2375    .2366         1            6.2
 47     700    .2889    .2822    .2029    .1838    .2375    .2344         1            5.8
 48     700    .2598    .2533    .2322    .2235    .2632    .2663         1            3.0
 49     700    .2619    .2631    .2523    .2513    .2630    .2631         3            0.2
 50     700    .2334    .2334    .2632    .2625    .2757    .2766         3            0.2
 51     700    .2363    .2349    .2455    .2447    .2508    .2525         3            0.4
 52    1000    .2514    .2517    .3086    .3093    .2640    .2617         3            0.5
 53    1000    .2514    .2510    .3086    .3091    .2640    .2625         3            0.4
 54    1000    .2514    .2504    .3086    .3084    .2640    .2642         3            0.2
 55    1000    .2514    .2511    .3086    .3094    .2640    .2631         3            0.2
 56    1000    .2514    .2510    .3086    .3088    .2640    .2644         3            0.1
 57     700    .2593    .2597    .2827    .2843    .2352    .2343         2            0.4
 58     700    .2672    .2669    .2251    .2254    .2892    .2907         3            0.3
 59     700    .2661    .2656    .2459    .2454    .2231    .2235         3            0.2
 60     700    .2511    .2512    .2812    .2810    .2390    .2380         3            0.2
 61     700    .2638    .2639    .2425    .2418    .2332    .2339         3            0.2
 62     700    .2407    .2412    .2586    .2579    .2555    .2558         3            0.2

(π4)ex. and π̄4 can be obtained from π1 + π2 + π3 + π4 = 1.

T.P.D. is defined in equation (4.4).
Legend for Experiment Numbers of the Above Table


Expt.
No.    M                                                                  Comments

 39     1,  2,  2,  3/ 3,  1,  1,  3/ 2,  2,  2,  3/ 2,  1,  2,  1      L throughout
 40     1, 12, 10, 11/ 8,  3, 11,  7/ 7,  7,  3, 12/ 7, 11, 11,  2      L diagonal, H off-diagonal
 41     7,  2,  1,  3/ 3,  8,  3,  1/ 3,  1, 10,  1/ 3,  1,  2, 11      H diagonal, L off-diagonal
 42     9,  7, 12,  9/12,  7,  8,  8/ 9, 12,  7,  7/ 8,  9, 12,  8      H throughout
 43    18, 14, 24, 18/24, 14, 16, 16/18, 24, 14, 14/16, 18, 24, 16      2 x No. 42
 44    20, 11,  8,  9/11, 21, 10, 11/12, 13, 22, 13/13,  8,  9, 21      H off-diagonal, higher diagonal
 45     8, 20, 19, 23/22,  8, 22, 22/21, 22, 10, 19/18, 19, 19, 11      H diagonal, higher off-diagonal
 46     4,  8,  3,  2/ 9,  1,  7,  6/ 6,  1,  9,  7/ 5,  3,  2,  7      Random 1 - 9
 47     8, 16,  6,  4/18,  2, 14, 12/12,  2, 18, 14/10,  6,  4, 14      2 x No. 46
 48    17, 15, 18, 15/15, 15, 20, 22/23, 20, 22, 19/21, 18, 17, 16      Random 15 - 24
 49    24, 16, 21, 15/23, 20, 20, 18/16, 19, 19, 18/15, 20, 18, 15        "      "
 50    15, 23, 23, 18/23, 24, 24, 16/19, 19, 19, 15/15, 16, 20, 22        "      "
 51    19, 22, 15, 22/15, 17, 24, 19/22, 23, 21, 23/22, 19, 22, 24
 52     1,  4,  3,  2/ 3,  2,  3,  1/ 4,  4,  2,  4/ 3,  4,  3,  1      Random 1 - 4
 53     2,  8,  6,  4/ 6,  4,  6,  2/ 8,  8,  4,  8/ 6,  8,  6,  2      2 x No. 52
 54     5, 20, 15, 10/15, 10, 15,  5/20, 20, 10, 20/15, 20, 15,  5      5 x No. 52
 55    10, 40, 30, 20/30, 20, 30, 10/40, 40, 20, 40/30, 40, 30, 10      10 x No. 52
 56    15, 60, 45, 30/45, 30, 45, 15/60, 60, 30, 60/45, 60, 45, 15      15 x No. 52
 57    22, 36, 24, 29/37, 30, 28, 20/36, 36, 32, 39/29, 33, 29, 21      Random 20 - 40
 58    36, 25, 34, 26/36, 36, 39, 34/34, 36, 40, 28/33, 21, 38, 27
 59    24, 32, 36, 29/27, 33, 21, 35/37, 25, 22, 24/34, 23, 23, 33
 60    36, 36, 33, 22/28, 39, 25, 29/27, 35, 33, 36/39, 33, 33, 31
 61    49, 39, 42, 33/42, 34, 30, 35/42, 43, 43, 42/30, 34, 30, 49      Random 30 - 50
 62    49, 46, 47, 39/31, 38, 38, 39/30, 39, 45, 49/48, 45, 36, 31
APPENDIX N


THE  EXPECTED  MEAN  RECURRENCE  TIMES  OF  A  2-STATE 
PROCESS  WHEN  THE  TRANSITION  PROBABILITIES  ARE 
INDEPENDENTLY  E  ETA  DISTRIBUTED 


Consider  the  2-state  Markov  process  with 


1 


-  a 


a 


P  - 


b 


i  -  b 


whe  re 


Mx)  =  fp(x  I  rn  j  ,  n j) 


and 


fb(y)  -  nz)- 


For  a  and  "b  exactly  known  the  exact  mean  recurrence 


times  (n  ‘s)  are  found  as  follows: 
'  n 


n 


l  1 


come  right 
back  to 
state  1 

=  rpri  .  l*  + 


go  to  state 
2 


Mn  +1) 

<-  i 


and 

n21  -  (b)  •  i  +  (1  b)(n21+l) 
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These  two  equations  yield 


]_ 

b 


and 


11 


a  +  b  _  _1 
b  7 r. 


Similarly, 


12  a 


and 


—  _  a  +  b  _  1 

n22  a  7^ 

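Now, let "a" and "b" be independent Beta variables as indicated above. Then,

$$E(\bar{n}_{11}) = E\!\left(\frac{1}{\pi_1}\right) = 1 + E\!\left(\frac{a}{b}\right) = 1 + E(a)\,E\!\left(\frac{1}{b}\right) \qquad (\because\ a \text{ and } b \text{ are independent})$$

$$= 1 + \frac{m_1}{m_1+n_1}\,E\!\left(\frac{1}{b}\right).$$

But

$$E\!\left(\frac{1}{b}\right) = \int_0^1 \frac{1}{y}\,\frac{y^{m_2-1}(1-y)^{n_2-1}}{\beta(m_2,n_2)}\,dy = \frac{\beta(m_2-1,n_2)}{\beta(m_2,n_2)} = \frac{(m_2-2)!\,(m_2+n_2-1)!}{(m_2+n_2-2)!\,(m_2-1)!} = \frac{m_2+n_2-1}{m_2-1}.$$

Therefore,

$$E(\bar{n}_{11}) = 1 + \frac{m_1(m_2+n_2-1)}{(m_1+n_1)(m_2-1)} = 1 + \frac{\bar{a}\left(1-\dfrac{1}{m_2+n_2}\right)}{\bar{b}-\dfrac{1}{m_2+n_2}}.$$

Similarly,

$$E(\bar{n}_{22}) = 1 + \frac{m_2(m_1+n_1-1)}{(m_2+n_2)(m_1-1)} = 1 + \frac{\bar{b}\left(1-\dfrac{1}{m_1+n_1}\right)}{\bar{a}-\dfrac{1}{m_1+n_1}} \tag{N.1}$$

and

$$E(\bar{n}_{12}) = \frac{m_1+n_1-1}{m_1-1} = \frac{1-\dfrac{1}{m_1+n_1}}{\bar{a}-\dfrac{1}{m_1+n_1}},$$

$$E(\bar{n}_{21}) = \frac{m_2+n_2-1}{m_2-1} = \frac{1-\dfrac{1}{m_2+n_2}}{\bar{b}-\dfrac{1}{m_2+n_2}}.$$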
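The closed forms of this appendix can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the parameters are taken to be integers, and the restriction m > 1 (needed for the mean of a reciprocal Beta variable to be finite) is made explicit although the text does not state it.

```python
from fractions import Fraction
from math import gamma


def expected_recurrence_times(m1, n1, m2, n2):
    """Closed-form E(n_ij) for a ~ Beta(m1, n1) and b ~ Beta(m2, n2), integer parameters."""
    if m1 <= 1 or m2 <= 1:
        # Assumption made explicit here: E(1/a) and E(1/b) are finite only for m > 1.
        raise ValueError("need m1 > 1 and m2 > 1")
    e_inv_a = Fraction(m1 + n1 - 1, m1 - 1)   # E(1/a)
    e_inv_b = Fraction(m2 + n2 - 1, m2 - 1)   # E(1/b)
    a_bar = Fraction(m1, m1 + n1)             # E(a)
    b_bar = Fraction(m2, m2 + n2)             # E(b)
    return {
        "n11": 1 + a_bar * e_inv_b,
        "n12": e_inv_a,
        "n21": e_inv_b,
        "n22": 1 + b_bar * e_inv_a,
    }


def beta_pdf(y, m, n):
    """Beta density with mean m / (m + n)."""
    return gamma(m + n) / (gamma(m) * gamma(n)) * y ** (m - 1) * (1 - y) ** (n - 1)


def mean_of_reciprocal(m, n, steps=20000):
    """Midpoint-rule estimate of E(1/b) for b ~ Beta(m, n); an independent check."""
    h = 1.0 / steps
    return sum(beta_pdf((i + 0.5) * h, m, n) / ((i + 0.5) * h) for i in range(steps)) * h
```

For example, `expected_recurrence_times(3, 2, 4, 6)` uses E(1/b) = 9/3, which `mean_of_reciprocal(4, 6)` reproduces by direct integration.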
APPENDIX O


STATE OCCUPANCY TIMES WHEN THE TRANSITION PROBABILITIES
ARE MULTIDIMENSIONAL BETA DISTRIBUTED


O.1. Determination of the Probability Mass Function of the Occupancy Time

Consider an N-state Markov process with transition matrix $P = (p_{ij})$ where the $p_{ij}$'s are multidimensional Beta distributed with parameters $m_{ij}$.

Let $u_i$ be the number of the transition on which the system leaves state i for the first time given that it is in state i before the first transition.

From equation (4.6),

$$p_{u_i \mid p_{ii}}(k \mid x) = x^{k-1}(1-x), \qquad k \ge 1 \tag{O.1}$$

We want to know the marginal probability mass function on $u_i$,

$$p_{u_i}(k) = \int_0^1 p_{u_i \mid p_{ii}}(k \mid x)\, f_{p_{ii}}(x)\, dx \tag{O.2}$$

But

$$f_{p_{ii}}(x) = f_\beta\Big(x \,\Big|\, m_{ii},\ \sum_{j \ne i} m_{ij}\Big) \tag{O.3}$$

(from section 3.2). Therefore, substituting equation (O.1) and equation (O.3) into equation (O.2), we obtain

$$p_{u_i}(k) = \int_0^1 x^{k-1}(1-x)\, \frac{1}{\beta\big(m_{ii},\ \sum_{j\ne i} m_{ij}\big)}\, x^{m_{ii}-1}(1-x)^{\sum_{j\ne i} m_{ij}-1}\, dx \tag{O.4}$$

$$p_{u_i}(k) = \frac{\big(\sum_{j} m_{ij}-1\big)!\ \big(\sum_{j\ne i} m_{ij}\big)}{(m_{ii}-1)!}\cdot \frac{(m_{ii}+k-2)!}{\big(\sum_{j} m_{ij}+k-1\big)!}, \qquad k \ge 1 \tag{O.5}$$

This can also be written as

$$p_{u_i}(k) = \begin{cases} \dfrac{\sum_{j\ne i} m_{ij}}{\sum_{j} m_{ij}}, & k = 1 \\[2ex] \dfrac{\big(\sum_{j\ne i} m_{ij}\big)\,(m_{ii})(m_{ii}+1)\cdots(m_{ii}+k-2)}{\big(\sum_{j} m_{ij}\big)\big(\sum_{j} m_{ij}+1\big)\cdots\big(\sum_{j} m_{ij}+k-1\big)}, & k \ge 2 \end{cases} \tag{O.6}$$

This last form can be normalized by letting

$$\bar{p}_{ii} = \frac{m_{ii}}{\sum_{j=1}^{N} m_{ij}}$$

and

$$w_i = \frac{1}{\sum_{j=1}^{N} m_{ij}}$$

for then

$$p_{u_i}(k) = \begin{cases} 1-\bar{p}_{ii}, & k = 1 \\[1ex] (1-\bar{p}_{ii})\left(\dfrac{\bar{p}_{ii}}{1+w_i}\right)\left(\dfrac{\bar{p}_{ii}+w_i}{1+2w_i}\right)\cdots\left(\dfrac{\bar{p}_{ii}+(k-2)w_i}{1+(k-1)w_i}\right), & k \ge 2 \end{cases} \tag{O.7}$$
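As an illustration, the factorial form and the normalized product form of the occupancy-time mass function can be compared numerically. The following is a minimal sketch, not part of the original development; the function names and the argument `s_off` (standing for the sum of $m_{ij}$ over $j \ne i$) are hypothetical.

```python
from math import exp, lgamma


def pmf_factorial(k, m_ii, s_off):
    """p_{u_i}(k) from the factorial form; s_off is the sum of m_ij over j != i."""
    total = m_ii + s_off  # sum of m_ij over all j
    # log of (total-1)! (m_ii+k-2)! / ((m_ii-1)! (total+k-1)!), using n! = Gamma(n+1)
    log_ratio = lgamma(total) + lgamma(m_ii + k - 1) - lgamma(m_ii) - lgamma(total + k)
    return s_off * exp(log_ratio)


def pmf_normalized(k, m_ii, s_off):
    """p_{u_i}(k) from the normalized product form in terms of p_bar and w."""
    total = m_ii + s_off
    p_bar = m_ii / total
    w = 1.0 / total
    prob = 1.0 - p_bar
    for l in range(k - 1):  # factors (p_bar + l*w) / (1 + (l+1)*w), l = 0 .. k-2
        prob *= (p_bar + l * w) / (1.0 + (l + 1) * w)
    return prob
```

With `m_ii = 3` and `s_off = 5`, both functions give 5/8 at k = 1, and the probabilities summed over k approach 1.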


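O.2. Bounds on the Mean Occupancy Time


Using equation (O.4), the mean occupancy time is

$$E(u_i) = \sum_{k=1}^{\infty} k\, p_{u_i}(k) \tag{O.8}$$

There is no apparent way to obtain this summation in closed form. However, a reasonable bounding technique has been developed. We are primarily concerned with an upper bound since a lower bound can easily be obtained by truncating the summation at a finite number of terms.

The process must eventually leave state i. Therefore,

$$\sum_{k=1}^{\infty} p_{u_i}(k) = 1.$$

Hence, from equation (O.5),

$$c \sum_{k=1}^{\infty} \frac{(m_{ii}+k-2)!}{\big(\sum_{j} m_{ij}+k-1\big)!} = 1$$

where

$$c = \frac{\big(\sum_{j} m_{ij}-1\big)!\ \big(\sum_{j\ne i} m_{ij}\big)}{(m_{ii}-1)!}$$

is independent of k. Therefore,

$$\sum_{k=1}^{\infty} \frac{(m_{ii}+k-2)!}{\big(\sum_{j} m_{ij}+k-1\big)!} = \frac{(m_{ii}-1)!}{\big(\sum_{j} m_{ij}-1\big)!\ \big(\sum_{j\ne i} m_{ij}\big)}$$

or, more generally,

$$\sum_{k=1}^{\infty} \frac{(a+k-2)!}{(b+k-1)!} = \frac{(a-1)!}{(b-1)!\,(b-a)} \qquad (b > a \ge 1) \tag{O.10}$$

Now, equation (O.8) gives

$$E(u_i) = c \sum_{k=1}^{\infty} \frac{k\,(m_{ii}+k-2)!}{\big(\sum_{j} m_{ij}+k-1\big)!} < c \sum_{k=1}^{\infty} \frac{(m_{ii}+k-2)!}{\big(\sum_{j} m_{ij}+k-2\big)!} = c\, \frac{(m_{ii}-1)!}{\big(\sum_{j} m_{ij}-2\big)!\ \big(\sum_{j\ne i} m_{ij}-1\big)}$$

(using equation (O.10)).

That is,

$$E(u_i) < \frac{\big(\sum_{j} m_{ij}-1\big)\big(\sum_{j\ne i} m_{ij}\big)}{\sum_{j\ne i} m_{ij}-1}.$$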
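The identity (O.10) and the resulting bracket on the mean occupancy time can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, integer parameters are assumed, and the infinite sums are truncated after a fixed number of terms (which gives the lower bound mentioned in the text).

```python
from math import factorial


def identity_lhs(a, b, terms=20000):
    """Partial sum of sum_{k>=1} (a+k-2)!/(b+k-1)!, built from the ratio of successive terms."""
    term = factorial(a - 1) / factorial(b)  # k = 1 term
    total = 0.0
    for k in range(1, terms + 1):
        total += term
        term *= (a + k - 1) / (b + k)       # ratio of term k+1 to term k
    return total


def identity_rhs(a, b):
    """Right-hand side (a-1)! / ((b-1)! (b-a)), valid for b > a >= 1."""
    return factorial(a - 1) / (factorial(b - 1) * (b - a))


def occupancy_mean_bounds(m_ii, s_off, terms=20000):
    """Return (truncated-sum lower bound, upper bound) on E(u_i); s_off = sum of m_ij, j != i."""
    total = m_ii + s_off
    c = factorial(total - 1) * s_off / factorial(m_ii - 1)
    term = factorial(m_ii - 1) / factorial(total)  # (m_ii+k-2)!/(total+k-1)! at k = 1
    lower = 0.0
    for k in range(1, terms + 1):
        lower += k * c * term
        term *= (m_ii + k - 1) / (total + k)
    # Upper bound c (m_ii-1)! / ((total-2)! (s_off-1)); requires s_off > 1.
    upper = c * factorial(m_ii - 1) / (factorial(total - 2) * (s_off - 1))
    return lower, upper
```

The truncated sum converges quickly when the off-diagonal parameters are not too small, since successive terms then fall off as a high inverse power of k.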


Hence, $E(u_i)$ can be bounded as follows:


(O.11)
where


Then, 


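APPENDIX P

DETERMINATION OF THE EXPECTED VALUES OF
THE TRAPPING PROBABILITIES IN A SPECIAL
TRAPPING STATES PROBLEM


Consider the Markov process with the transition matrix (rows and columns ordered as states 1, 2, 3)

$$P = \begin{bmatrix} 1-q_1-q_2 & q_1 & q_2 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

$q_1$ and $q_2$ are multidimensional Beta distributed. That is,

$$f_{q_1,q_2}(x,y) = f_\beta(x, y \mid m_2, m_3, m_1) = \frac{1}{\beta(m_1,m_2,m_3)}\,(1-x-y)^{m_1-1}\,x^{m_2-1}\,y^{m_3-1}$$

$$0 \le x;\quad 0 \le y;\quad \text{and}\quad x + y \le 1.$$

For known $q_1$ and $q_2$ let $g_j$ = pr(process traps in state j).

$$g_2 = q_1 + (1-q_1-q_2)\,q_1 + (1-q_1-q_2)^2\,q_1 + \cdots$$

$$= q_1\left[1 + (1-q_1-q_2) + (1-q_1-q_2)^2 + \cdots\right] \tag{P.1}$$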
When $q_1$ and $q_2$ become random variables, from equation (P.1)

$$E(g_2) = E(q_1) + E[q_1(1-q_1-q_2)] + E[q_1(1-q_1-q_2)^2] + \cdots$$

But

$$E[q_1(1-q_1-q_2)^k] = \frac{1}{\beta(m_1,m_2,m_3)} \int_0^1\!\!\int_0^{1-x} (1-x-y)^{m_1+k-1}\, x^{m_2}\, y^{m_3-1}\, dy\, dx$$

Substituting $z = y/(1-x)$ and performing the double integration gives

$$E[q_1(1-q_1-q_2)^k] = \frac{\beta(m_3,\ m_1+k)\ \beta(m_2+1,\ m_1+m_3+k)}{\beta(m_1,m_2,m_3)}$$

so that

$$E(g_2) = \sum_{k=0}^{\infty} \frac{\beta(m_3,\ m_1+k)\ \beta(m_2+1,\ m_1+m_3+k)}{\beta(m_1,m_2,m_3)}$$

This simplifies to

$$E(g_2) = \frac{m_2\,\Gamma(m_1+m_2+m_3)}{\Gamma(m_1)} \sum_{k=0}^{\infty} \frac{(m_1+k-1)!}{(m_1+m_2+m_3+k)!} \tag{P.2}$$

Similarly,

$$E(g_3) = \frac{m_3\,\Gamma(m_1+m_2+m_3)}{\Gamma(m_1)} \sum_{k=0}^{\infty} \frac{(m_1+k-1)!}{(m_1+m_2+m_3+k)!}$$
But we know that $E(g_2) + E(g_3) = 1$. Therefore,

$$\frac{\Gamma(m_1+m_2+m_3)\,(m_2+m_3)}{\Gamma(m_1)} \sum_{k=0}^{\infty} \frac{(m_1+k-1)!}{(m_1+m_2+m_3+k)!} = 1$$

or

$$\sum_{k=0}^{\infty} \frac{(m_1+k-1)!}{(m_1+m_2+m_3+k)!} = \frac{\Gamma(m_1)}{(m_2+m_3)\,\Gamma(m_1+m_2+m_3)} \tag{P.3}$$


NOTE: 


This  checks  with  the  identity  (equation  (0.10))  developed  in 
Appendix  O  if  in  the  identity  we  substitute  a  =  +1  and  b  = 

+  +  1 . 


Substituting equation (P.3) into equation (P.2), there follows

$$E(g_2) = \frac{m_2}{m_2+m_3}$$

and

$$E(g_3) = \frac{m_3}{m_2+m_3}$$
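The agreement between the series (P.2) and the closed form can be checked numerically. This is a sketch under the assumption of integer parameters; the function names are hypothetical and the series is truncated after a fixed number of terms.

```python
from math import factorial


def expected_trapping_series(m1, m2, m3, terms=20000):
    """E(g2) and E(g3) from the series form, truncated after `terms` terms."""
    big_m = m1 + m2 + m3
    term = factorial(m1 - 1) / factorial(big_m)  # k = 0 term of (m1+k-1)!/(M+k)!
    series = 0.0
    for k in range(terms):
        series += term
        term *= (m1 + k) / (big_m + k + 1)       # ratio of term k+1 to term k
    scale = factorial(big_m - 1) / factorial(m1 - 1)  # Gamma(M) / Gamma(m1)
    return m2 * scale * series, m3 * scale * series


def expected_trapping_closed(m2, m3):
    """Closed forms m2/(m2+m3) and m3/(m2+m3); note that m1 drops out."""
    return m2 / (m2 + m3), m3 / (m2 + m3)
```

For `m1, m2, m3 = 2, 3, 1` both routes give E(g2) = 0.75 and E(g3) = 0.25 to within truncation error.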
APPENDIX Q


PROOF THAT E(N.R. | m, n, a; k) = E(N.R. | m, n, a)


Q.1. Algebraic Proof

Equations (5.4), (5.2) and (5.3) give

$$E(N.R. \mid m,n,a;k) = \sum_{r=0}^{k} p_{\beta b}(r \mid m,n,k)\cdot E(N.R. \mid m+r,\ n+k-r,\ a) \tag{5.4}$$

$$p_{\beta b}(r \mid m,n,k) = \frac{(m+r-1)!\,(n+k-r-1)!\,k!\,(m+n-1)!}{r!\,(m-1)!\,(k-r)!\,(n-1)!\,(m+n+k-1)!} \tag{5.2}$$

and

$$E(N.R. \mid m,n,a) = s\left[(r_1-c) + (r_2-r_1)\,\frac{a}{a+1}\,F\!\left(1, n \,\Big|\, m+n \,\Big|\, \frac{1}{a+1}\right)\right] \tag{5.3}$$

Substituting equation (5.2) and equation (5.3) into equation (5.4) gives

$$E(N.R. \mid m,n,a;k) = s(r_1-c) + s(r_2-r_1)\,\frac{a}{a+1} \sum_{r=0}^{k} \frac{(m+r-1)!\,(n+k-r-1)!\,k!\,(m+n-1)!}{r!\,(m-1)!\,(k-r)!\,(n-1)!\,(m+n+k-1)!}\; F\!\left(1, n+k-r \,\Big|\, m+n+k \,\Big|\, \frac{1}{a+1}\right) \tag{Q.1}$$
Now, from equation (G.3) of Appendix G we know that

$$F\!\left(1, p \,\Big|\, q \,\Big|\, \frac{1}{a+1}\right) = 1 + \frac{p}{q}\left(\frac{1}{a+1}\right) + \frac{p(p+1)}{q(q+1)}\left(\frac{1}{a+1}\right)^2 + \cdots \tag{Q.2}$$

Therefore, both E(N.R. | m, n, a; k) and E(N.R. | m, n, a) can be thought of as power series in (1/(a+1)). Let $A_j$ be the coefficient of $(1/(a+1))^j$ in E(N.R. | m, n, a; k). Then,

$$A_j = s\,a\,(r_2-r_1) \sum_{r=0}^{k} \frac{(m+r-1)!\,(n+k-r-1)!\,k!\,(m+n-1)!}{r!\,(m-1)!\,(k-r)!\,(n-1)!\,(m+n+k-1)!}\cdot \frac{(n+k-r)(n+k-r+1)\cdots(n+k-r+j-2)}{(m+n+k)(m+n+k+1)\cdots(m+n+k+j-2)}$$

$$= s\,a\,(r_2-r_1) \sum_{r=0}^{k} \frac{(m+r-1)!\,(n+k+j-r-2)!\,k!\,(m+n-1)!}{r!\,(m-1)!\,(k-r)!\,(n-1)!\,(m+n+k+j-2)!}$$

$$= \frac{s\,a\,(r_2-r_1)\,(n+j-2)!\,(m+n-1)!}{(m+n+j-2)!\,(n-1)!} \sum_{r=0}^{k} \frac{(m+r-1)!\,(n+k+j-r-2)!\,k!\,(m+n+j-2)!}{r!\,(m-1)!\,(k-r)!\,(m+n+k+j-2)!\,(n+j-2)!}$$

where the last summation is

$$\sum_{r=0}^{k} p_{\beta b}(r \mid m,\ n+j-1,\ k) = 1.$$

Therefore,

$$A_j = \frac{s\,a\,(r_2-r_1)\; n(n+1)\cdots(n+j-2)}{(m+n)(m+n+1)\cdots(m+n+j-2)}$$

Now, using equations (5.3) and (Q.2), this is seen to be the coefficient of $(1/(a+1))^j$ in E(N.R. | m, n, a). Hence,

$$E(N.R. \mid m,n,a;k) = E(N.R. \mid m,n,a),$$

the result that was intuitively anticipated at the beginning of section 5.2.2. An interesting side issue is that the following identity has been proved:

$$F(1, n \mid m+n \mid a) = \sum_{r=0}^{k} p_{\beta b}(r \mid m,n,k)\; F(1,\ n+k-r \mid m+n+k \mid a)$$
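The side identity can be verified numerically for particular integer parameters. The sketch below is illustrative: the function names are hypothetical, the hypergeometric function is evaluated from its series with the first parameter fixed at 1, and the argument `z` is assumed to satisfy |z| < 1 so that the truncated series converges.

```python
from math import factorial


def beta_binomial_pmf(r, m, n, k):
    """Beta-binomial mass function of equation (5.2), integer m, n >= 1."""
    num = factorial(m + r - 1) * factorial(n + k - r - 1) * factorial(k) * factorial(m + n - 1)
    den = (factorial(r) * factorial(m - 1) * factorial(k - r) * factorial(n - 1)
           * factorial(m + n + k - 1))
    return num / den


def hyp_f(p, q, z, terms=500):
    """Truncated series F(1, p | q | z) = sum_j [p(p+1)...(p+j-1)] / [q(q+1)...(q+j-1)] z^j."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (p + j) / (q + j) * z
    return total


def identity_sides(m, n, k, z):
    """Left and right sides of the identity for argument z."""
    lhs = hyp_f(n, m + n, z)
    rhs = sum(beta_binomial_pmf(r, m, n, k) * hyp_f(n + k - r, m + n + k, z)
              for r in range(k + 1))
    return lhs, rhs
```

Calling `identity_sides(3, 2, 4, 0.4)` returns two values that agree to rounding error, independent of the number of observations k.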


Q.2. Alternate Proof of Why E(N.R. | m, n, a; k) is Independent of k,
the Number of Transitions Observed

The following theorem (that is of interest for decision theory in general) will enable us to justify the independence of E(N.R. | m, n, a; k) from k.

Theorem: Consider a random variable x. Suppose R(x) is a "pure" function of x; that is, it is independent of the distribution of x (this rules out such situations as $R(x) = c(x-\bar{x})^2$ because the mean, $\bar{x}$, is a function of the distribution). Let there be an experiment G whose outcome y is a function of x; more precisely, the likelihood function is $f_{y|x}(y_0 \mid x_0)$. Then the expected posterior mean of R before performing the experiment, given that it will be performed, is equal to the mean of R without experimentation.


Proof:

$$E(R \mid y_0) = \int_{x_0} R(x_0)\, f_{x|y}(x_0 \mid y_0)\, dx_0 = \int_{x_0} R(x_0)\, \frac{f_{y|x}(y_0 \mid x_0)\, f_x(x_0)}{f_y(y_0)}\, dx_0 \qquad \text{(Bayes)}$$

$$\text{expected posterior mean} = \int_{y_0} E(R \mid y_0)\, f_y(y_0)\, dy_0$$

$$= \int_{y_0}\!\int_{x_0} R(x_0)\, \frac{f_{y|x}(y_0 \mid x_0)\, f_x(x_0)}{f_y(y_0)}\, dx_0\; f_y(y_0)\, dy_0$$

$$= \int_{x_0} R(x_0)\, f_x(x_0) \left( \int_{y_0} f_{y|x}(y_0 \mid x_0)\, dy_0 \right) dx_0$$

$$= \int_{x_0} R(x_0)\, f_x(x_0)\, dx_0$$

$$= E(R) \qquad \text{Q.E.D.}$$
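A small discrete example illustrates the theorem. The sketch below is not from the original text: it assumes a random variable with finitely many values and an experiment with finitely many outcomes, and the function names are hypothetical.

```python
def expected_posterior_mean(prior, likelihood, reward):
    """Average, over experimental outcomes y, of the posterior mean of R.

    prior[x]         : prior probability of the x-th value
    likelihood[x][y] : probability of outcome y given the x-th value
    reward[x]        : R evaluated at the x-th value (a "pure" function of x)
    """
    n_values = len(prior)
    n_outcomes = len(likelihood[0])
    total = 0.0
    for y in range(n_outcomes):
        p_y = sum(prior[x] * likelihood[x][y] for x in range(n_values))
        if p_y == 0.0:
            continue  # outcome cannot occur, so it contributes nothing
        posterior = [prior[x] * likelihood[x][y] / p_y for x in range(n_values)]  # Bayes
        total += p_y * sum(posterior[x] * reward[x] for x in range(n_values))
    return total


def prior_mean(prior, reward):
    """Mean of R without experimentation."""
    return sum(p * r for p, r in zip(prior, reward))
```

For any choice of likelihood whose rows sum to one, the two functions return the same value, which is the content of the theorem.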
*NOTE.

If R(x) was a function of the distribution of x, $R(x_0)$ would have been a function of the experimental outcome $y_0$ and we could not have performed the simple integration on $y_0$. Hence, the proof would not be valid.

Now in the 2-state problem considered, because we cannot control the decision mechanism after the observations, E(N.R.) becomes a "pure" function of the random variable "b". Hence, E(N.R. | m, n, a; k) = E(N.R. | m, n, a).
APPENDIX R


DETERMINATION OF THE EXPECTED VALUE OF PERFECT
INFORMATION IN A SPECIAL 2-STATE MARKOV PROCESS


Consider a 2-state Markov process with transition matrix

$$P = \begin{bmatrix} 1-a & a \\ b & 1-b \end{bmatrix}$$

where "a" is known exactly but

$$f_b(b_0) = f_\beta(b_0 \mid m, n)$$

Let c be the fixed cost per period for using the process,
$r_j$ the reward per time period for being in state j (j = 1, 2),
and d the cost of observing a transition from state 2.

If it is desirable (i.e. the expected revenue is positive), we can use the process for s periods in the steady state. Also, we have the option of buying the right to observe k transitions from state 2.

In section 5.2.3 it was shown that the expected net revenue following an optimum procedure is

$$E[N.R. \mid f_b(b_0)] = \max_{O,\,U,\,S} \begin{cases} \text{Observe:} & E(N.R. \mid m,n,a;k) \\ \text{Use:} & E(N.R. \mid m,n,a) \\ \text{Stop:} & 0 \end{cases} \tag{R.1}$$

To evaluate E(N.R. | P.I.), we proceed as follows.

For a known value of "b", say $b_0$, $(\pi_2)_{ex} = \dfrac{a}{a+b_0}$ is the exact probability of being in state 2 in the steady state and the expected reward per period in the steady state is

$$R = r_1(\pi_1)_{ex} + r_2(\pi_2)_{ex} - c$$

$$= r_1 - c + (r_2-r_1)(\pi_2)_{ex}$$

$$= r_1 - c + (r_2-r_1)\,\frac{a}{a+b_0}$$
We would use the process if R > 0,

i.e.

$$r_1 - c + (r_2-r_1)\,\frac{a}{a+b_0} > 0$$

$$(r_1-c)(a+b_0) + (r_2-r_1)\,a > 0$$

or use only if

$$b_0(r_1-c) + (r_2-c)\,a > 0 \tag{R.2}$$

If c < both $r_1$ and $r_2$, we will always use the process.

If c > both $r_1$ and $r_2$, we will never use the process.
Therefore, we need only consider c between $r_1$ and $r_2$.


Case i: $r_1 < c < r_2$

Inequality (R.2) becomes: use process when $b_0 < \dfrac{(r_2-c)\,a}{c-r_1}$.

Therefore, if $b_0 < \dfrac{(r_2-c)\,a}{c-r_1}$, then $E(N.R. \mid b_0) = s\left[r_1-c+(r_2-r_1)\dfrac{a}{a+b_0}\right]$;

if $b_0 > \dfrac{(r_2-c)\,a}{c-r_1}$, then $E(N.R. \mid b_0) = 0$.

Now, from equation (5.11)

$$E(N.R. \mid P.I.) = \int_0^1 E(N.R. \mid b_0)\, f_b(b_0)\, db_0$$

$$= \int_0^{\frac{(r_2-c)a}{c-r_1}} s\left[r_1-c+(r_2-r_1)\frac{a}{a+b_0}\right] f_\beta(b_0 \mid m,n)\, db_0 + \int_{\frac{(r_2-c)a}{c-r_1}}^{1} (0)\, f_\beta(b_0 \mid m,n)\, db_0$$

i.e.

$$E(N.R. \mid P.I.) = \int_0^{\frac{(r_2-c)a}{c-r_1}} s\left[r_1-c+(r_2-r_1)\frac{a}{a+b_0}\right] f_\beta(b_0 \mid m,n)\, db_0 \tag{R.3}$$

Finally, (R.1) and (R.3) would be used to give

$$E.V.P.I. = E(N.R. \mid P.I.) - E[N.R. \mid f_b(b_0)] \tag{R.4}$$

Case ii: $r_2 < c < r_1$

Inequality (R.2) becomes: use process when $b_0 > \dfrac{(c-r_2)\,a}{r_1-c}$.

Consequently,

$$E(N.R. \mid P.I.) = \int_{\frac{(c-r_2)a}{r_1-c}}^{1} s\left[r_1-c+(r_2-r_1)\frac{a}{a+b_0}\right] f_\beta(b_0 \mid m,n)\, db_0 \tag{R.5}$$

Then proceed as above.
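For Case i, the integral (R.3) and the resulting E.V.P.I. can be approximated by simple numerical integration. The following is a minimal sketch under several assumptions that are not in the text: the function names are hypothetical, the threshold is clamped at 1 because $b_0$ is a probability, and the Observe branch of (R.1) is left out, which is reasonable here only because Appendix Q shows E(N.R. | m, n, a; k) = E(N.R. | m, n, a).

```python
from math import gamma


def beta_pdf(b, m, n):
    """Beta density f_beta(b | m, n) with mean m / (m + n)."""
    return gamma(m + n) / (gamma(m) * gamma(n)) * b ** (m - 1) * (1 - b) ** (n - 1)


def net_revenue(b, a, r1, r2, c, s):
    """Net revenue from s steady-state periods when b is known exactly."""
    return s * (r1 - c + (r2 - r1) * a / (a + b))


def integrate(f, lo, hi, steps=20000):
    """Midpoint rule on [lo, hi]."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * h) for i in range(steps)) * h


def evpi_case_i(m, n, a, r1, r2, c, s):
    """Return (E(N.R.|P.I.), E[N.R.|f_b], E.V.P.I.) for r1 < c < r2."""
    if not (r1 < c < r2):
        raise ValueError("Case i requires r1 < c < r2")
    threshold = min(1.0, (r2 - c) * a / (c - r1))  # clamp at 1: an assumption of this sketch

    def weighted(b):
        return net_revenue(b, a, r1, r2, c, s) * beta_pdf(b, m, n)

    with_pi = integrate(weighted, 0.0, threshold)  # equation (R.3)
    use = integrate(weighted, 0.0, 1.0)            # E(N.R. | m, n, a)
    without_pi = max(use, 0.0)                     # Use versus Stop; Observe branch omitted
    return with_pi, without_pi, with_pi - without_pi
```

With perfect information the process is used only for values of $b_0$ below the threshold, so the first returned value can never be smaller than the second.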
APPENDIX S


THE USE OF A GAMMA PRIOR ON THE SECOND
PARAMETER OF A GAMMA DISTRIBUTION²⁷


S.1 Justification for the Gamma Prior.

Consider a Gamma variable t with parameters $\alpha$ and $\mu$ where $\alpha$ is assumed exactly known but $\mu$ will be a random variable.

$$f_{t|\mu}(x \mid y) = f_\gamma(x \mid \alpha, y) = \frac{y^{\alpha}\, x^{\alpha-1}\, e^{-yx}}{\Gamma(\alpha)}, \qquad 0 \le x \tag{S.1}$$

Let E be the event that k independent draws from this distribution take on the values $x_1, x_2, \ldots, x_k$.

Clearly

$$pr(E \mid y) = \prod_{i=1}^{k} \frac{y^{\alpha}\, x_i^{\alpha-1}\, e^{-y x_i}}{\Gamma(\alpha)}$$

$$pr(E \mid y) = \frac{(x_1 x_2 \cdots x_k)^{\alpha-1}}{(\Gamma(\alpha))^{k}}\ \underbrace{y^{\alpha k}\, e^{-y \sum_{i=1}^{k} x_i}}_{\text{Kernel}} \tag{S.2}$$

This suggests using a prior on $\mu$ of the form

$$C\, \mu^{v-1} e^{-w\mu}$$

but this is seen to be a Gamma distribution with parameters v and w, i.e., the conjugate prior is

$$f_\mu(y) = f_\gamma(y \mid v, w) = \frac{w^{v}}{\Gamma(v)}\, y^{v-1} e^{-wy}, \qquad 0 \le y \tag{S.3}$$


²⁷ Part of the results of this appendix have been stated (but without proof) in an article by Scarf.

Scarf, H. E., "Some Remarks on Bayes Solutions to the Inventory Problem", Naval Research Logistics Quarterly, Vol. 7 (1960), pp. 591-6.
S.2 The Marginal Distribution of "t" and its Moments


$$f_t(x) = \int_0^{\infty} f_{t|\mu}(x \mid y)\, f_\mu(y)\, dy$$

$$= \int_0^{\infty} \frac{y^{\alpha} x^{\alpha-1} e^{-yx}}{\Gamma(\alpha)}\cdot \frac{w^{v} y^{v-1} e^{-wy}}{\Gamma(v)}\, dy$$

$$= \frac{w^{v} x^{\alpha-1}}{\Gamma(\alpha)\,\Gamma(v)} \int_0^{\infty} y^{\alpha+v-1} e^{-(x+w)y}\, dy$$

$$= \frac{w^{v} x^{\alpha-1}}{\Gamma(\alpha)\,\Gamma(v)}\cdot \frac{\Gamma(\alpha+v)}{(x+w)^{\alpha+v}}$$

$$f_t(x) = \frac{w^{v}\, x^{\alpha-1}}{\beta(\alpha, v)\,(x+w)^{\alpha+v}}, \qquad 0 \le x \tag{S.4}$$

$$E(t) = \int_0^{\infty} x\, f_t(x)\, dx = \int_0^{\infty} \frac{w^{v}\, x^{\alpha}}{\beta(\alpha, v)\,(x+w)^{\alpha+v}}\, dx$$

This can be integrated to obtain

$$E(t) = \frac{w}{\beta(\alpha, v)} \sum_{k=0}^{\alpha} \binom{\alpha}{k} \frac{(-1)^{\alpha-k}}{\alpha+v-1-k}$$

provided $v \ge 2$.

Unfortunately, this is a rather complicated function of the parameters v and w. The situation with the variance will be even worse. As was done in the discrete time problem, we would like to choose v and w to satisfy our prior estimates of E(t) and $\overset{v}{t}$. This would be very difficult with such complicated expressions for E(t) and $\overset{v}{t}$. Hence, we restrict attention to the special case of $\alpha = 1$, i.e. the density function of t given $\mu$ is

$$f_{t|\mu}(x \mid y) = f_e(x \mid y) = y\, e^{-yx}, \qquad 0 \le x \tag{S.5}$$

the exponential distribution with mean $1/y$.

Setting $\alpha = 1$ in (S.4), we obtain

$$f_t(x) = \frac{v\, w^{v}}{(x+w)^{v+1}}, \qquad 0 \le x \quad (\alpha = 1) \tag{S.6}$$

$$E(t) = \frac{w}{v-1} \qquad \text{for } v \ge 2 \text{ and } \alpha = 1 \tag{S.7}$$

In the same way we find that

$$E(t^2) = \frac{2w^2}{(v-2)(v-1)}$$

$$\overset{v}{t} = E(t^2) - [E(t)]^2$$

which simplifies to

$$\overset{v}{t} = \frac{v\, w^2}{(v-1)^2 (v-2)} \qquad \text{for } v \ge 3 \text{ and } \alpha = 1 \tag{S.8}$$
S.3 Bayes Modification of the Gamma Prior


Let the event E be as defined in Section S.1.

Now

$$f_\mu(y \mid E) = \frac{pr(E \mid y)\, f_\mu(y)}{pr(E)}$$

$$= \frac{\left(w + \sum_{i=1}^{k} x_i\right)^{v+\alpha k}\, y^{\,v+\alpha k-1}\, e^{-y\left(w + \sum_{i=1}^{k} x_i\right)}}{\Gamma(v+\alpha k)}$$

i.e. $f_\mu(y \mid E) = f_\gamma\!\left(y \,\Big|\, v+\alpha k,\ w + \sum_{i=1}^{k} x_i\right)$, a Gamma distribution with modified parameters. Note that the individual $x_i$ values are not important; all we need know is their sum.
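The parameter update and the moments for the special case $\alpha = 1$ are simple enough to sketch in a few lines. The function names below are hypothetical, and the check on v reflects the condition under which the variance in (S.8) is finite.

```python
def gamma_posterior(v, w, alpha, draws):
    """Posterior parameters (v', w') of the Gamma prior after the observed draws.

    Only the number of draws and their sum enter the update.
    """
    return v + alpha * len(draws), w + sum(draws)


def exponential_marginal_moments(v, w):
    """Mean and variance of t for alpha = 1, following (S.7) and (S.8)."""
    if v <= 2:
        raise ValueError("the variance is finite only for v > 2")
    mean = w / (v - 1)
    variance = v * w ** 2 / ((v - 1) ** 2 * (v - 2))
    return mean, variance
```

For instance, a prior with v = 4 and w = 6 has marginal mean 2 and variance 8; after observing draws that sum to 6 over three observations (with $\alpha = 1$), the parameters become v = 7 and w = 12 and the marginal mean is still 2.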
APPENDIX T


THE USE OF A NORMAL PRIOR ON THE MEAN OF A
NORMAL DISTRIBUTION²⁸


T.1 Justification for the Normal Prior.

Consider a Normal variable r whose variance $\sigma^2$ is assumed exactly known, but whose mean $\mu$ will be a random variable.

$$f_{r|\mu}(x \mid y) = f_N(x \mid y, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(\frac{-(x-y)^2}{2\sigma^2}\right), \qquad -\infty < x < \infty$$

Let E be the event that k independent draws from this distribution take on the values $x_1, x_2, \ldots, x_k$.

Clearly,

$$pr(E \mid y) = \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(\frac{-(x_i-y)^2}{2\sigma^2}\right)$$

which can be expressed as

$$pr(E \mid y) = C\ \underbrace{\exp\!\left(\frac{-(y-a)^2}{2b^2}\right)}_{\text{Kernel}}$$

where a, b and C are constants.

This suggests using a prior on $\mu$ of the form

$$C \exp\!\left(\frac{-(\mu-v)^2}{2w\sigma^2}\right)$$

This is merely a Normal distribution with mean v and variance $w\sigma^2$, i.e., the conjugate prior is

$$f_\mu(y) = f_N(y \mid v, w\sigma^2), \qquad -\infty < y < \infty \tag{T.2}$$


²⁸ A portion of the results of this appendix have been given by Raiffa and Schlaifer.²⁹ However, for completeness, they are rederived here with slight notational changes.


T.2 The Marginal Distribution of "r" and its Moments.


After performing the integration on y and considerable simplification there results,

$$f_r(x) = \frac{1}{\sqrt{2\pi(w+1)}\,\sigma} \exp\!\left(\frac{-(x-v)^2}{2(w+1)\sigma^2}\right) = f_N\!\left(x \mid v, (w+1)\sigma^2\right) \tag{T.3}$$

Hence, r is normally distributed with mean v and variance $(w+1)\sigma^2$. Therefore, $E(r) = v$ and $\overset{v}{r} = (w+1)\sigma^2$.


T.3 Bayes Modification of the Normal Prior.

Let the event E be as defined in section T.1.

Now

$$f_\mu(y \mid E) = \frac{pr(E \mid y)\, f_\mu(y)}{pr(E)}$$

After considerable straightforward manipulations (such as completing the square in "y") there follows

$$f_\mu(y \mid E) = f_N\!\left(y \,\Big|\, \frac{v + w\sum_{i=1}^{k} x_i}{1+kw},\ \frac{w\sigma^2}{1+kw}\right),$$

a Normal distribution with modified parameters. Again note that the individual values are not important; all we need know is their sum.
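The Normal update can be sketched in the (v, w) parametrization used in this appendix, where the prior on the mean has mean v and variance $w\sigma^2$. The function names are hypothetical, and the update formulas are the standard result of completing the square; they are offered here as an illustration rather than as the author's exact expressions.

```python
def normal_posterior(v, w, draws):
    """Posterior (v', w') for a Normal prior with mean v and variance w * sigma^2.

    The known variance sigma^2 cancels out of the update in this parametrization.
    """
    k = len(draws)
    v_post = (v + w * sum(draws)) / (1 + k * w)
    w_post = w / (1 + k * w)
    return v_post, w_post


def marginal_moments(v, w, sigma2):
    """Mean and variance of r: v and (w + 1) * sigma^2."""
    return v, (w + 1) * sigma2
```

Starting from v = 0 and w = 1, two draws summing to 6 give v' = 2 and w' = 1/3, so with $\sigma^2 = 4$ the marginal variance of the next draw falls from 8 to 16/3.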
GLOSSARY OF SYMBOLS

Symbol          Meaning                                                                                   Page

$a_{ij}$        Transition rate from state i to state j in a continuous time process (i ≠ j)              127
$A$             $A = (a_{ij})$ matrix                                                                     127
$B$             Event                                                                                     137
$B(F)$          Event                                                                                     9
$(B(F), s)$     Joint event that B(F) occurs and the starting state is s                                  12
$c$             Cost per period for using a Markov process                                                93
$c_{ij}$        Cost of assuming that matrix $^{i}P$ is in use when $^{j}P$ is really governing the process    9
$C(k)$          Cost of assuming matrix $^{k}P$
$cov(p, q)$     Covariance of p and q                                                                     37
$d$             Cost of observing a transition                                                            93
$d_{rs}$        Cost of observing a transition from state r to state s                                    24
$E, E_r$        Events                                                                                    35, 94
$E_k$           Error in $E(\pi)$ when only k terms are used in the hypergeometric expansion              50
$E(x) = \bar{x}$   Expected value of the random variable x                                                37

Symbol                          Meaning                                                                                   Page

$E(N.R. \mid m, n, a)$          Expected net revenue in a 2-state process given that one transition probability is known exactly at "a" and the other is Beta distributed with parameters "m" and "n"    95
$E(N.R. \mid m, n, a; k)$       Same as above except k observations of transitions related to the unknown probability will be taken (does not include the cost of observations)    95
$E(N.R. \mid (m_{ij}))$         Expected net revenue following an optimum policy without experimentation when the transition probabilities are multidimensional Beta distributed with parameters $(m_{ij})$    108
$E(N.R. \mid (m_{ij}); k)$      Same as above except k transitions will be observed    105
$E(N.R. \mid f_x(x_0))$         Expected net revenue following an optimum policy given that the random variable x has the density function shown    112
$E(N.R. \mid h, i, (m_{ij}))$   Expected net revenue in the next h periods given that a transition will occur from state i just before the first period and the transition probabilities are multidimensional Beta distributed with parameters $(m_{ij})$    122
$E(N.R. \mid P.I.)$             Expected net revenue following an optimum policy given perfect information about the unknown parameter(s)    112
$E.V.P.I.$                      Expected value of perfect information    112
Symbol                                                     Meaning                                                                                   Page

$f_{a,b,\ldots,d}(w, x, \ldots, z)$                        Joint density function of the random variables a, b, ..., d evaluated at the point (w, x, ..., z)    36
$f_{a|b}(w \mid x)$                                        Conditional density function of the random variable "a" evaluated at the point "w" given that the random variable "b" has taken on the value "x"    36
$f_i$                                                      Number of times that the process has been in state i    40
$f_{ij}$                                                   Number of transitions from state i to state j    40
$f_N(r_0 \mid \mu_0, \sigma^2)$                            Normal distribution    140
$f_\beta(x_1, x_2, \ldots, x_k \mid m_1, m_2, \ldots, m_k)$    Multidimensional Beta distribution    37
Symbol
f  (x[v,  w) 

y  . 

F(a.  c-b|c[  z) 


F(p.  q  I  r  I  z) 


gj 

M  =  (m  .) 
ij 


M 


L.S. 


5je  sjt 

ml  nl 


n.  . 
ij 


N 


Meaning  Page 

Gamma  distribution  128 

Hyper  geometric  function  197 

(integral  form) 

Hypergeometric  function  197 

(series  form) 

Probability  of  trapping  in  state  j  88 

Parameters  of  the  multidimen-  103 


sional  Beta  prior  distributions 
on  the  probabilities  of  a  Markov 
transition  matrix 

Value  assigned  to  the  sum  of  38 

the  parameters  of  a  single 
multidimensional Beta distribution
by the least squares
technique  of  Appendix  C 

Smallest  integers  for  which  59 

a  -  m^  /(m^+nj )  takes  on  a 

specified  value 

Number  of  occurrences  of  the  35 

i  category  of  a  multinomial 
distribution

Number  of  transitions  to  get  72 

to  state  j  for  the  first  time 
given  that  the  process  is  in 
state  i  before  the  first  transi¬ 
tion 

Mean  recurrence  time  for  state  i  15 

k 

given  that  matrix  P  is  being 
used 

Number  of  states  in  a  Markov  - 

process 


1 


Symbol 


N(F) 


pu 


Pii 


P  (k) 
x 


P 


y 


(k|j) 


Ppb(r  lm>  n> k) 


k_  ,k  . 
p  -  < 


a 


Meaning 

Number  of  possible  sequences 
producing  the  transition  fre¬ 
quency  count  F  =  (f,  .) 


Variance  of  p 

Value  assigned  to  the  variance 
of  p  by  least  squares  technique 

Probability  that  a  random  draw 
th 

falls  in  the  i  category  of  a 
multinomial 

Probability  that  the  state  at  time 
n  +  1  is  j  given  that  the  state 
at time n is i

Expected  value  of  diagonal  transi  - 
th. 

tion  probability  of  1  row  of 

matrix' p  ,  =  m  /  Em.. 

'  u  11  .  11 

J  J 

Probability  that  the  discrete 
random  variable  x  takes 
on  the  value  k 

Conditional  probability  that  the 
discrete  random  variable  x 
takes  on  the  value  "k  "  given  that 
the  variable  "y1'  assumes  the 
value  j 

Beta- binomial  probability  mass 
function 

,  th  ,  . 

k  possible  transition  matrix 
in a multi-matrix Markov process

Number  of  possible  transition 
matrices  in  a  multi-matrix 
Markov  process 


Page


10 


37 

178 

35 


40 


68 


1  5 


15 


94 

9 


8 
Symbol              Meaning                                                                                   Page

$r_j$               Reward per period for being in state j    69
$r_{ij}$            Reward per transition from state i to state j    21
$R$                 Expected reward per period in the steady state when the transition probabilities and rewards are known exactly    92
$R_i(n)$            Expected reward in the next n periods given that the present state is i and the transition probabilities and rewards are known exactly    148
$R_t$               Expected reward per transition in the steady state when the transition probabilities and rewards are known exactly    144
$s$                 Number of periods in the steady state that the process will be used    94
Sk                  Probability that the process starts in state s given that matrix $^{k}P$ is being used    12
$s_{\bar{\pi}_j}$   Sample standard deviation of the mean of the steady state probability of being in state j    63
$s_{ij}$            Total observed reward in $f_{ij}$ transitions from state i to state j    147
$t_{ij}(n)$         Expected number of times in state j in the next n periods given that the present state is i    149
Symbol


T 


k 


T.P.D. 


u. 


1 


vh(a',i) 


V.  w. . 

1J  1J 


v.(m,  n) 


w. 


i 


a 


Meaning 

k 


Total  per  cent  deviations  of 
the  sample  means  from  the 
exact  steady  state  values 
N 

T.P.D.  =  100  S  I  (it  )  -  tr  I 

j=l  J  CX  J 

Number  of  the  transition  on  which 
the  system  leaves  state  i  for  the 
first  time  given  that  it  is  in  state  i 
before  the  first  transition 


Mean  number  of  transitions  to  first 
leave  state  i  given  that  the  process 
is  in  state  i  before  the  first 
k 

transition  and  P  is  the  transition 
matrix 

Expected  cost  if  an  optimal  policy 
is  followed  and  we  are  in  state  i 
with  probability  vector  a'  over 
the  matrices  of  a  multi-matrix 
process and there are h decision
periods  left 

Parameters of the prior distributions
in the Exponential-Gamma
and Normal-Normal
frameworks

Value  function  of  a  special  2 
state  Markov  decision  problem 

w 

i 


i)  Discount  factor 
ii) Probability that P is being


m 


ij 


139 

63 

76 

19 

2.0 

146  147 

119 

77 

101 

26 
Symbol

Meaning 

Page 

used  when  there  are  only  2 
possible  matrices  in  a  multi¬ 
matrix  process 

9 

Qk 

Probability  that  P  is  being 
used  in  a  multi-matrix  proc¬ 
ess 

8 

P(m1 ,  m2>  .  .  , 

mk) 

Generalized  Beta  function 

36 

Change  in  E(R^)  caused  by  the 
observation  of  a  reward 
of  value  x  to  x  +  dx  for  a  transi¬ 
tion  from  state  k  to  state  m 

155 

k 

Tt 

S 

Steady state probability of being
in  state  s  given  that  matrix 

12 

k_  ,  .  , 

P  is  being  used. 

i 

if. 

J 

Sample  mean  value  of  n. 

63 

(r.) 

J  ex 

Steady  state  probability  of  being 
in  state  j  when  the  transition 
probabilities  are  assumed  exactly 
known  at  their  mean  values 

63 

O’ 

X 

Standard  deviation  of  x 

— 

tP.j(n  |  P) 

Probability  that  the  state  at 
time  n  is  j  given  that  the  state 
at  time  0  is  i  and  the  transi¬ 
tion  matrix  is  P 

82 

<t>.  .(n) 
ij 

Unconditional (marginal) probability
that the state at time n
is  j  given  that  the  state  at  time 

0  is  i 

82 

V> 

Probability  that  the  state  at 

13 

i  1 
Symbol


[<Mn|a,  b)] 
x |  as  y f 


Meaning 

time n is j given that the state
at  time  0  is  i  and  P  is  being 
used 


*  denotes  that  a^  is  replaced 
by  a^  and  b^  by  b^ 

x  monotonically  decreases  as 
y  increases 


BIOGRAPHICAL SKETCH


Edward A. Silver was born in Montreal, Canada on June 13, 1937. He attended grammar school and West Hill High School in Montreal.

Mr. Silver's undergraduate education was taken at McGill University where in May, 1959 he received a Bachelor of Civil Engineering (Applied Mechanics Option). While at McGill, Mr. Silver was the recipient of several scholarships and at the 1959 convocation he was named a university scholar and was awarded the C. Michael Morssen Gold Medal for Great Distinction and Engineering Promise, a British Association Medal for Great Distinction and the Robert Forsyth Prize for Theory of Structures.

Since entering graduate school in September, 1959, Mr. Silver has been a research assistant in the M.I.T. Operations Research Center. His work in this capacity has led to the presentation of a paper entitled "The Use of the Hypergeometric Function as Part of Bayesian Estimation in a Two-State Markov Process" at the May 1963 National meeting of the Operations Research Society of America in Cleveland, Ohio. He is also the author of "The Transient Solutions for 3-State, Discrete Time, Markov Processes", a publication of the M.I.T. Operations Research Center. During his first year at M.I.T., Mr. Silver held a Johnson Foundation Fellowship.

Mr. Silver is a member of Tau Beta Pi, Sigma Xi and Phi Epsilon Alpha honor societies, as well as the Operations Research Society of America, the Institute of Management Sciences, and the Engineering Institute of Canada.


